DNA sequences useful for the synthesis of carotenoids.

Disclosed are DNA sequences which are useful for the synthesis of carotenoids such as lycopene,
β-carotene, zeaxanthin or zeaxanthin-diglucoside, that is, DNA sequences encoding carotenoid
biosynthesis enzymes. These DNA sequences are the sequences ① - ⑥, respectively, shown in the
specification.

Also disclosed is a process for producing a carotenoid compound which is selected from the group
consisting of prephytoene pyrophosphate, phytoene, lycopene, β-carotene, zeaxanthin and
zeaxanthin-diglucoside, which comprises transforming a host with at least one of the DNA
sequences ① - ⑥ described above and culturing the transformant.

DNA SEQUENCES USEFUL FOR THE SYNTHESIS OF CAROTENOIDS



BACKGROUND OF THE INVENTION 



Field of the Art

The present invention relates to DNA sequences which are useful for the synthesis of carotenoids such
as lycopene, β-carotene, zeaxanthin or zeaxanthin-diglucoside.

The present invention also relates to processes for producing such carotenoid compounds.

Related Art 

Carotenoids are distributed widely in green plants. They are yellow-orange-red lipids which are also
present in some molds, yeasts and so forth, and have recently received increased attention as natural
coloring materials for foods. Among these carotenoids, β-carotene is a typical one, which is used as a
coloring material and also as a precursor of vitamin A in mammals. It is also being examined for use as a
component for preventing cancer [see, for example, SHOKUHIN TO KAIHATSU (Foods and Development),
24, 61-65 (1989)]. Because carotenoids such as β-carotene are widely distributed in green plants, plant
tissue culture has been examined as a means of producing carotenoids in large amounts free from the
influence of the natural environment [see, for example, Plant Cell Physiol., 12, 525-531 (1971)].
Attempts have also been made to find a microorganism such as a mold, yeast or green alga which is by
nature highly productive of carotenoids and to produce carotenoids in large amounts with such a
microorganism (see, for example, The Abstract of Reports in the Annual Meeting of NIPPON HAKKO
KOGAKU-KAI of 1988, page 139). However, neither of these methods is at present successful in producing
β-carotene at a productivity which exceeds that of the synthetic method used in commercial production of
β-carotene. It would be very useful to obtain a gene group which participates in the biosynthesis of
carotenoids, because it would then be possible to produce carotenoids in large amounts by introducing
the gene group, reconstructed so that the proper genes in it are expressed in large amounts, into an
appropriate host which originally produces carotenoids, such as a plant tissue culture cell, a mold, a
yeast or the like. Such a development in technology may lead to a method of producing β-carotene
superior to the synthetic method and to a method of producing useful carotenoids other than β-carotene
in large amounts.

Furthermore, obtaining the gene group participating in the biosynthesis of carotenoids will make
possible the synthesis of carotenoids in a cell or an organ which produces no carotenoid, which will add
new value to organisms. For example, several reports have recently been made on creating flower colors
which cannot be found in nature by genetic manipulation of flowering plants [see, for example, Nature,
330, 677-678 (1987)]. The color of flowers is developed by pigments such as anthocyanin or carotenoids.
Anthocyanin is responsible for flower colors in the red-violet-blue spectrum, and carotenoids are
responsible for flower colors in the yellow-orange-red spectrum. The gene of the enzyme for synthesizing
anthocyanin has been elucidated, and the aforementioned reports on creating a new flower color are those
relating to anthocyanin. On the other hand, there are many flowering plants which have no bright yellow
flower because their petals lack the ability to synthesize carotenoids (e.g. petunia, saintpaulia
(African violet), cyclamen, Primula malacoides, etc.). If suitable genes from a gene group relating to
the biosynthesis of carotenoids, reconstructed so as to be expressed in the petal, are introduced into
these flowering plants, flowering plants having yellow flowers will be created.

However, enzymes for synthesizing carotenoids and the genes coding for them have scarcely been
elucidated at present. The nucleotide sequence of a gene group participating in the biosynthesis of a
kind of carotenoid has lately been elucidated only in the photosynthetic bacterium Rhodobacter capsulatus
[Mol. Gen. Genet., 216, 254-268 (1989)]. But this bacterium synthesizes the acyclic xanthophyll
spheroidene via neurosporene without cyclization and thus cannot synthesize general carotenoids such as
lycopene, β-carotene and zeaxanthin.

There is prior art relating to yellow pigments or carotenoids of Erwinia species disclosed in J.
Bacteriol., 168, 607-612 (1986), J. Bacteriol., 170, 4675-4680 (1988) and J. Gen. Microbiol., 130,
1623-1631 (1984). The first of these references discloses the cloning of a gene cluster coding for yellow
pigment synthesis from Erwinia herbicola Eho 10 ATCC 39368 as a 12.4 kilobase pair (kb) fragment. The
nucleotide sequence of the 12.4 kb fragment is, however, not shown. The second reference discloses the
yellow pigment synthesized by the cloned gene cluster, which is indicated to belong to the carotenoids by
analysis of its UV-visible spectrum. The last reference indicates that the gene participating in the
production of a yellow pigment is present on a 260 kb large plasmid contained in Erwinia uredovora 20D3
ATCC 19321, from the observation that the yellow pigment is not produced on curing the large plasmid, and
further discloses that the pigment belongs to the carotenoids from the analysis of its UV-visible
spectrum.

However, the chemical structures of the carotenoids produced by the Erwinia species and of their
metabolic intermediates, the enzymes participating in their synthesis, and the nucleotide sequences of
the genes encoding these enzymes remain unknown at present.



DISCLOSURE OF THE INVENTION 




Outline of the invention 

The object of the present invention is to provide DNA sequences which are useful for the synthesis of
carotenoids such as lycopene, β-carotene, zeaxanthin or zeaxanthin-diglucoside, that is, DNA sequences
encoding carotenoid biosynthesis enzymes.

In other words, the DNA sequences useful for the synthesis of carotenoids according to the present
invention are the DNA sequences ① - ⑥ described in the following (1) - (6).

(1) a DNA sequence encoding a polypeptide which has an enzymatic activity for converting
prephytoene pyrophosphate into phytoene and whose amino acid sequence corresponds substantially to
the amino acid sequence from A to B shown in Figs. 1-(a) and (b) (DNA sequence ①);

(2) a DNA sequence encoding a polypeptide which has an enzymatic activity for converting
zeaxanthin into zeaxanthin-diglucoside and whose amino acid sequence corresponds substantially to the
amino acid sequence from C to D shown in Figs. 2-(a) and (b) (DNA sequence ②);

(3) a DNA sequence encoding a polypeptide which has an enzymatic activity for converting lycopene
into β-carotene and whose amino acid sequence corresponds substantially to the amino acid sequence
from E to F shown in Figs. 3-(a) and (b) (DNA sequence ③);

(4) a DNA sequence encoding a polypeptide which has an enzymatic activity for converting phytoene
into lycopene and whose amino acid sequence corresponds substantially to the amino acid sequence from
G to H shown in Figs. 4-(a), (b) and (c) (DNA sequence ④);

(5) a DNA sequence encoding a polypeptide which has an enzymatic activity for converting
geranylgeranyl pyrophosphate into prephytoene pyrophosphate and whose amino acid sequence
corresponds substantially to the amino acid sequence from I to J shown in Figs. 5-(a) and (b) (DNA
sequence ⑤); and

(6) a DNA sequence encoding a polypeptide which has an enzymatic activity for converting
β-carotene into zeaxanthin and whose amino acid sequence corresponds substantially to the amino acid
sequence from K to L shown in Fig. 6 (DNA sequence ⑥).
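
The six conversions listed in (1) - (6) can be summarized as a lookup table. The following Python sketch is illustrative only: the names `CONVERSIONS`, `sequence_for` and `pathway_order` are hypothetical and do not appear in the specification, and β-carotene is written as "beta-carotene" for plain ASCII. Only the substrate/product pairs are taken from the list above.

```python
# DNA sequence number -> (substrate, product), as listed in items (1) - (6).
# The dictionary and helper names are illustrative assumptions.
CONVERSIONS = {
    1: ("prephytoene pyrophosphate", "phytoene"),
    2: ("zeaxanthin", "zeaxanthin-diglucoside"),
    3: ("lycopene", "beta-carotene"),
    4: ("phytoene", "lycopene"),
    5: ("geranylgeranyl pyrophosphate", "prephytoene pyrophosphate"),
    6: ("beta-carotene", "zeaxanthin"),
}


def sequence_for(substrate):
    """Return the number of the DNA sequence whose enzyme acts on `substrate`."""
    for number, (sub, _product) in CONVERSIONS.items():
        if sub == substrate:
            return number
    raise KeyError(substrate)


def pathway_order(start="geranylgeranyl pyrophosphate"):
    """Chain the conversions from `start` and return the sequence numbers in order."""
    by_substrate = {sub: (n, prod) for n, (sub, prod) in CONVERSIONS.items()}
    order = []
    current = start
    while current in by_substrate:
        number, current = by_substrate[current]
        order.append(number)
    return order


print(pathway_order())  # [5, 1, 4, 3, 6, 2]
```

Chaining the pairs in this way shows that the numbering ① - ⑥ does not follow the order of the reactions: starting from geranylgeranyl pyrophosphate, the enzymes act in the order ⑤, ①, ④, ③, ⑥, ②.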

Another object of the present invention is to provide processes for producing carotenoid compounds.
More specifically, the present invention also provides a process for producing a carotenoid compound
which is selected from the group consisting of prephytoene pyrophosphate, phytoene, lycopene,
β-carotene, zeaxanthin and zeaxanthin-diglucoside, which comprises transforming a host with at least
one of the DNA sequences ① - ⑥ described above and culturing the transformant.

Effect of the Invention 

The successful acquisition according to the present invention of the gene group useful for the
synthesis of carotenoids such as lycopene, β-carotene, zeaxanthin, zeaxanthin-diglucoside or the like
(the gene group encoding the biosynthetic enzymes of carotenoids) has made it possible to produce useful
carotenoids in large amounts, for example, by creating a plasmid in which the gene(s) can be expressed in
a large amount and employing an appropriate plant tissue culture cell, microorganism or the like
transformed with the plasmid. It has also made it possible to synthesize carotenoids in cells or organs
which produce no carotenoid, by creating a plasmid in which the gene(s) can be expressed in the target
cell or organ and transforming a suitable host with this plasmid.



DETAILED DESCRIPTION OF THE INVENTION 



The DNA sequences according to the present invention are the aforementioned DNA sequences ① - ⑥,
that is, genes encoding the polypeptides of the respective enzymes which participate in the biosynthesis
reactions of carotenoids, in particular, for example, such polypeptides in Erwinia uredovora 20D3 ATCC
19321.

A variety of gene groups containing a combination of a plurality of these DNA sequences ① - ⑥ can be
expressed in a microorganism, a plant or the like to confer on them the ability to biosynthesize
carotenoids such as lycopene, β-carotene, zeaxanthin, zeaxanthin-diglucoside or the like. The respective
DNA sequences constituting the gene group may all be present on one DNA strand, may each be present on
a different DNA strand, or may be divided so that a plurality of the DNA sequences are present on one
DNA strand and another DNA sequence is present on another DNA strand.

The aforementioned gene group encodes the polypeptides of a plurality of enzymes participating in the
production of carotenoids. A recombinant DNA is created by incorporating the gene group into a proper
vector and is then introduced into a suitable host to create a transformant. The transformant is cultured
so that it produces, mainly within itself, the plurality of enzymes participating in the formation
reactions of carotenoids, and these enzymes carry out the biosynthesis of carotenoids in the
transformant.

The DNA sequence shown in Fig. 7-(a) to (g), which is an example according to the present invention,
is acquired from Erwinia uredovora 20D3 ATCC 19321 and, as illustrated in the experimental example
below, exhibits no homology in DNA-DNA hybridization with the DNA strand containing the gene group
for synthesizing the yellow pigment of Erwinia herbicola Eho 10 ATCC 39368 (see Related Art described
above).



DNA Sequences encoding the polypeptide of each enzyme 

The DNA sequences of the present invention are the DNA sequences ① - ⑥ (or the DNA strands ① - ⑥),
respectively. Each of the DNA sequences contains a nucleotide sequence encoding a polypeptide whose
amino acid sequence corresponds substantially to the amino acid sequence in the aforementioned specific
region in Figs. 1 - 6 (for example, from A to B in Fig. 1). In this connection, the term "DNA sequence"
means a polydeoxyribonucleic acid sequence having a length. In the present invention, the "DNA
sequence" is defined by the amino acid sequence of the polypeptide which it encodes, and that amino acid
sequence has a definite length as described above, so that each DNA sequence also has a definite length.
However, the DNA sequence contains a gene encoding each enzyme and is useful for biotechnological
production of the polypeptide, and such biotechnological production cannot be performed with the DNA
sequence of definite length alone but can be performed when another DNA sequence of a proper length is
linked to its 5'-upstream and/or 3'-downstream side. Therefore, the term "DNA sequence" in the present
invention includes, in addition to those having a definite length (for example, the length of the region
A - B in the corresponding amino acid sequence of Fig. 1), those in the form of a linear DNA strand or a
circular DNA strand containing the DNA sequence of definite length as a member.

One of the typical forms of each DNA sequence according to the present invention is the form of a
plasmid which comprises the DNA sequence as a part or a member, or a form in which the plasmid is
present in a host such as E. coli. The plasmid as one of the preferable existing forms of each DNA
sequence according to the present invention is a combination of the DNA sequence according to the
present invention as a passenger or a foreign gene, a replicable plasmid vector present stably in a host,
and a promoter (containing ribosome-binding sites in the case of a procaryote). As the plasmid vector and
the promoter, an appropriate combination of those which are well-known can be used.



Polypeptides encoded by DNA sequences

As mentioned above, the DNA sequences according to the present invention are respectively specified
by the amino acid sequences of the polypeptides encoded thereby. Each of these polypeptides is one
having an amino acid sequence which corresponds substantially to the amino acid sequence in a specific
region as described above in Figs. 1 - 6 (for example, from A to B in Fig. 1). Here, in the six
polypeptides (A-B, C-D, E-F, G-H, I-J, K-L) shown in Figs. 1 - 6 (i.e. six enzymes participating in the
formation of carotenoids), some of the amino acids can be deleted or substituted, or some amino acids can
be added or inserted, etc., so long as each polypeptide retains the aforementioned enzymatic activity,
that is, the relationship between its substrate and its converted substance (product). This is indicated
by the expression "whose amino acid sequence corresponds substantially to ..." in the claims. For
example, a polypeptide in which the first amino acid (Met) has been deleted from a polypeptide shown in
Figs. 1 - 6 is included among such deleted polypeptides.

The typical polypeptides having the respective enzymatic activities in the present invention are those
in the specific regions in Figs. 1 - 6 described above, and the amino acid sequences of these
polypeptides have not previously been known.



Nucleotide sequences of DNA sequences 

The DNA sequences encoding the respective enzymes are those having the nucleotide sequences in
the aforementioned specific regions in Figs. 1 - 6 (for example, A-B in Fig. 1) or degenerative isomers
thereof, or those having the nucleotide sequences corresponding to the aforementioned alterations of the
amino acid sequences of the respective enzymes or degenerative isomers thereof. The term "degenerative
isomer" means a DNA sequence which differs only in degenerate codons and codes for the same
polypeptide. The preferred embodiments of the DNA sequences according to the present invention are
those having at least one stop codon (such as TAA) at the 3'-terminal. The 5'-upstream and/or the
3'-downstream of the DNA sequences according to the present invention may further have a DNA sequence
of a certain length as a non-translated region (the initial portion of the 3'-downstream being usually a
stop codon such as TAA).
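
The notion of a "degenerative isomer" can be illustrated with a short Python sketch. It assumes the standard genetic code; the two example sequences are invented for illustration and are not taken from Figs. 1 - 6, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding strand from its first codon up to the first stop codon."""
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def are_degenerate_isomers(seq_a, seq_b):
    """True if the two sequences differ as DNA but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Invented example: GCT/GCC both encode Ala, AAA/AAG both encode Lys; TAA is the stop codon.
print(translate("ATGGCTAAATAA"))                               # MAK
print(are_degenerate_isomers("ATGGCTAAATAA", "ATGGCCAAGTAA"))  # True
```

Both example sequences end in the stop codon TAA, in line with the preferred embodiment described above.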



Gene group used for the synthesis of carotenoids 

The gene group (the gene cluster in some cases) used for the synthesis of carotenoids comprises a
plurality of the aforementioned DNA sequences ① - ⑥, typical examples of which are illustrated in the
following (1) - (4). Each gene group encodes the polypeptides of a plurality of enzymes, and these
enzymes participate in the production reactions of carotenoids to produce them from their substrates.

(1) Gene group used for the synthesis of lycopene

The gene group used for the synthesis of lycopene, which is a red carotenoid, is a DNA sequence
comprising the aforementioned DNA sequences ①, ④ and ⑤. Such a gene group includes one in which the
respective DNA sequences are present on one DNA strand, one in which they are present separately on
different DNA strands, and one constructed by combining these arrangements as necessary.

In the case that a plurality of DNA sequences are present on one DNA strand, the arrangement order
and direction of the aforementioned DNA sequences ①, ④ and ⑤ may be optional provided that the
genetic information can be expressed, that is to say, provided that the respective genes are in a state
of being transcribed and translated appropriately in a host.

The biosynthetic pathway of lycopene in E. coli is explained as follows: geranylgeranyl pyrophosphate,
which is a substrate originally present in E. coli, is converted into prephytoene pyrophosphate by the
enzyme encoded by the DNA sequence ⑤, the prephytoene pyrophosphate is then converted into
phytoene by the enzyme encoded by the DNA sequence ①, and the phytoene is further converted into
lycopene by the enzyme encoded by the DNA sequence ④ (see Fig. 8).

Lycopene is a carotene whose color is red. It is a red pigment which is present in a large amount in
the fruits of watermelon or tomato and is highly safe as a food. In this connection, the lycopene which
was synthesized by means of the DNA sequences according to the present invention in the experimental
example described below had the same stereochemistry as the lycopene present in these plants.

One of the typical existing forms of the gene group of the present invention is the form of a plasmid
which comprises, as a member, the respective DNA sequences each containing a stop codon, or a form in
which the plasmid is present in a host such as E. coli. The plasmid which is one of the preferred
existing forms of the gene group according to the present invention comprises the gene group as a
passenger or a foreign gene, a replicable plasmid vector present stably in a host, and a promoter
(containing ribosome-binding sites in the case of a procaryote). As for the promoter, in procaryotes such
as E. coli or Zymomonas species a promoter common to the respective DNA sequences can be used, or
alternatively a separate promoter can be used for each DNA sequence. In the case of eucaryotes such as
yeasts or plants, a separate promoter is preferably used for each DNA sequence.

The preferred existing forms of the individual DNA sequences are as described above in the explanation
of the DNA sequences ① - ⑥.



(2) Gene group used for the synthesis of β-carotene

The gene group used for the synthesis of β-carotene, which is one of the yellow-orange carotenoids, is
a DNA sequence comprising the aforementioned DNA sequences ①, ③, ④ and ⑤. In other words, the
gene group used for the synthesis of β-carotene is formed by adding the DNA sequence ③ to a DNA
sequence used for the synthesis of lycopene comprising the DNA sequences ①, ④ and ⑤. The gene group
includes one in which the respective DNA sequences constituting it are present on one DNA strand, one
in which they are present individually on different DNA strands, and one constructed by combining these
arrangements as necessary.

In the case that a plurality of DNA sequences are present on one DNA strand, the arrangement order
and direction of the aforementioned DNA sequences ①, ③, ④ and ⑤ may be optional provided that the
genetic information can be expressed, that is to say, provided that the respective genes are in a state
of being transcribed and translated appropriately in a host.

The biosynthetic pathway of β-carotene in E. coli is explained as follows: geranylgeranyl pyrophosphate,
which is a substrate originally present in E. coli, is converted into prephytoene pyrophosphate by the
enzyme encoded by the DNA sequence ⑤, the prephytoene pyrophosphate is converted into phytoene by
the enzyme encoded by the DNA sequence ①, the phytoene is further converted into lycopene by the
enzyme encoded by the DNA sequence ④, and the lycopene is further converted into β-carotene by the
enzyme encoded by the DNA sequence ③ (see Fig. 8).

β-Carotene is a typical carotene whose color is in the spectrum ranging from yellow to orange, and it is
an orange pigment which is present in a large amount in the roots of carrot or the green leaves of plants
and is highly safe as a food. The utility of β-carotene has already been described in the explanation of
the related art. In this connection, the β-carotene which was synthesized by means of the DNA sequences
according to the present invention in the experimental example described below had the same
stereochemistry as the β-carotene present in the roots of carrot or the green leaves of plants.

The typical existing forms of the gene group and of the individual DNA sequences are the same as
defined in (1).



(3) Gene group used for the synthesis of zeaxanthin

The gene group used for the synthesis of zeaxanthin, which is one of the yellow-orange carotenoids, is a
DNA sequence comprising the aforementioned DNA sequences ①, ③, ④, ⑤ and ⑥. In other words, the
DNA sequence used for the synthesis of zeaxanthin is formed by adding the DNA sequence ⑥ to a DNA
sequence used for the synthesis of β-carotene comprising the DNA sequences ①, ③, ④ and ⑤. The gene
group includes one in which the respective DNA sequences constituting it are present on one DNA
strand, one in which they are present individually on different DNA strands, and one constructed by
combining these arrangements as necessary.

In the case that a plurality of DNA sequences are present on one DNA strand, the arrangement order
and direction of the aforementioned DNA sequences ①, ③, ④, ⑤ and ⑥ may be optional provided that
the genetic information can be expressed, that is to say, provided that the respective genes are in a
state of being transcribed and translated appropriately in a host.

The biosynthetic pathway of zeaxanthin in E. coli is explained as follows: geranylgeranyl pyrophosphate,
which is a substrate originally present in E. coli, is converted into prephytoene pyrophosphate by the
enzyme encoded by the DNA sequence ⑤, the prephytoene pyrophosphate is converted into phytoene by
the enzyme encoded by the DNA sequence ①, the phytoene is then converted into lycopene by the
enzyme encoded by the DNA sequence ④, the lycopene is further converted into β-carotene by the
enzyme encoded by the DNA sequence ③, and finally the β-carotene is converted into zeaxanthin by the
enzyme encoded by the DNA sequence ⑥ (see Fig. 8).

Zeaxanthin is a xanthophyll whose color is in the spectrum ranging from yellow to orange, and it is a
yellow pigment which is present in the seed of maize and is highly safe as a food. Zeaxanthin is
contained in feeds for hens or colored carp and is an important pigment source for coloring them. In
this connection, the zeaxanthin which was synthesized by means of the DNA sequences according to the
present invention in the experimental example described below had the same stereochemistry as the
zeaxanthin described above.

The typical existing forms of the gene group and of the individual DNA sequences are the same as
defined in (1).

(4) Gene group used for the synthesis of zeaxanthin-diglucoside

The gene group used for the synthesis of zeaxanthin-diglucoside, which is one of the yellow-orange
carotenoids, is a DNA sequence comprising the aforementioned DNA sequences ① - ⑥. In other words,
the gene group used for the synthesis of zeaxanthin-diglucoside is formed by adding the DNA sequence ②
to a DNA sequence used for the synthesis of zeaxanthin comprising the DNA sequences ①, ③, ④, ⑤
and ⑥. The gene group includes one in which the respective DNA sequences constituting it are present
on one DNA strand, one in which they are present individually on different DNA strands, and one
constructed by combining these arrangements as necessary.

In the case that a plurality of DNA sequences are present on one DNA strand, the arrangement order
and direction of the aforementioned DNA sequences ① - ⑥ may be optional provided that the genetic
information can be expressed, that is to say, provided that the respective genes are in a state of being
transcribed and translated appropriately in a host.

The typical existing forms of the gene group and of the individual DNA sequences are the same as
defined in (1).

The biosynthetic pathway of zeaxanthin-diglucoside in E. coli is explained as follows: geranylgeranyl
pyrophosphate, which is a substrate originally present in E. coli, is converted into prephytoene
pyrophosphate by the enzyme encoded by the DNA sequence ⑤, the prephytoene pyrophosphate is
converted into phytoene by the enzyme encoded by the DNA sequence ①, the phytoene is then converted
into lycopene by the enzyme encoded by the DNA sequence ④, the lycopene is further converted into
β-carotene by the enzyme encoded by the DNA sequence ③, the β-carotene is then converted into
zeaxanthin by the enzyme encoded by the DNA sequence ⑥, and the zeaxanthin is finally converted into
zeaxanthin-diglucoside by the enzyme encoded by the DNA sequence ② (see Fig. 8).

Zeaxanthin-diglucoside is a carotenoid glycoside with high water solubility; it is a pigment which
dissolves sufficiently in water at room temperature and exhibits a clear yellow color. Carotenoid
pigments are generally hydrophobic and are thus limited in their use as natural coloring materials in
foods or the like. Zeaxanthin-diglucoside overcomes this defect. Zeaxanthin-diglucoside has been isolated
from the edible plant saffron, Crocus sativus (Pure & Appl. Chem., 47, 121-128 (1976)), so that its
safety as a food is considered to have been confirmed. Therefore, zeaxanthin-diglucoside is desirable as
a yellow natural coloring material for foods or the like. In this connection, there has heretofore been
no report of the isolation of zeaxanthin-diglucoside from microorganisms.

If the carotenoid pigments lycopene, β-carotene, zeaxanthin and zeaxanthin-diglucoside are to be
produced using E. coli as the host, the aforementioned DNA sequences ①, ④ and ⑤, the DNA sequences
①, ③, ④ and ⑤, the DNA sequences ①, ③, ④, ⑤ and ⑥, and the DNA sequences ① - ⑥ are required,
respectively. However, when a host other than E. coli is used, particularly one which is capable of
producing carotenoids, there is a high possibility that it also contains carotenoid precursors further
downstream in the biosynthesis, so that not all of the aforementioned DNA sequences ①, ④ and ⑤ (for
the production of lycopene), the DNA sequences ①, ③, ④ and ⑤ (for the production of β-carotene), the
DNA sequences ①, ③, ④, ⑤ and ⑥ (for the production of zeaxanthin), or the DNA sequences ① - ⑥
(for the production of zeaxanthin-diglucoside) are always required. That is to say, in this case only
the DNA sequence(s) participating in the formation of the desired carotenoid pigment from the carotenoid
precursor present furthest downstream in the host may be used. Thus, when lycopene is to be produced as
the desired carotenoid in a host in which phytoene is already present, it is also possible to use only
the DNA sequence ④ among the DNA sequences ①, ④ and ⑤.

It is also possible to make a host produce, as the desired carotenoid pigment-related compound,
prephytoene pyrophosphate from geranylgeranyl pyrophosphate by using only the DNA sequence ⑤ of the
present invention, or phytoene by using the DNA sequences ① and ⑤ of the present invention or, if the
host contains prephytoene pyrophosphate, by using only the DNA sequence ①.
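
The selection rule described above (use only the DNA sequences whose enzymes act between the furthest-downstream precursor present in the host and the desired compound) can be written as a small Python sketch. The names `PATHWAY` and `required_sequences` are hypothetical, β-carotene is written as "beta-carotene", and the sketch assumes the linear pathway given in the text with nothing beyond it.

```python
# Linear pathway as described in the text: each compound is paired with the number
# of the DNA sequence whose enzyme converts it into the next compound in the list.
# The final product has no further conversion, hence None.
PATHWAY = [
    ("geranylgeranyl pyrophosphate", 5),
    ("prephytoene pyrophosphate", 1),
    ("phytoene", 4),
    ("lycopene", 3),
    ("beta-carotene", 6),
    ("zeaxanthin", 2),
    ("zeaxanthin-diglucoside", None),
]


def required_sequences(target, host_precursor="geranylgeranyl pyrophosphate"):
    """Return the DNA sequence numbers needed to reach `target` from `host_precursor`.

    `host_precursor` is the furthest-downstream pathway compound already present
    in the host (geranylgeranyl pyrophosphate in the case of E. coli).
    """
    names = [name for name, _ in PATHWAY]
    start = names.index(host_precursor)  # raises ValueError for unknown compounds
    end = names.index(target)
    if start > end:
        raise ValueError("target lies upstream of the precursor present in the host")
    return [sequence for _, sequence in PATHWAY[start:end]]


print(required_sequences("lycopene"))                              # [5, 1, 4]
print(required_sequences("lycopene", host_precursor="phytoene"))   # [4]
```

With the default precursor the function reproduces the sets stated for E. coli (for example ⑤, ① and ④ for lycopene), and with `host_precursor="phytoene"` it returns only ④, matching the example in the text.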



Acquirement of DNA sequences 

One method for acquiring the DNA sequences ① - ⑥ which contain the nucleotide sequences coding for
the amino acid sequences of the respective enzymes is the chemical synthesis of at least a part of their
strands by a method of polynucleotide synthesis. However, considering the large number of amino acids
that are bonded, it is preferable to chemical synthesis to acquire the DNA sequences from a DNA library
of Erwinia uredovora 20D3 ATCC 19321 according to a conventional method in the field of genetic
engineering, for example, the hybridization method with a suitable probe.

The individual DNA sequences or a DNA sequence comprising all of these sequences is thus obtained.



Transformant

The aforementioned gene group comprising a plurality of the DNA sequences ① - ⑥ can be
constituted by using the DNA sequences obtained as described above. The DNA sequence thus obtained
contains the genetic information for making an enzyme participating in the formation of carotenoids, so
that it can be introduced into an appropriate host by biotechnological methods to form a transformant
and to produce the enzyme and, in turn, a carotenoid pigment or a carotenoid pigment-related compound.



(1) Host 

Plants and a variety of microorganisms can be the target of transformation by a vector comprising the
aforementioned DNA sequences, as far as a suitable host-vector system is available. However, the host is
required to contain geranylgeranyl pyrophosphate, which is the substrate compound of the enzyme that
starts the carotenoid synthesis using the DNA sequences of the present invention, or a compound further
downstream from it.

It is known that geranylgeranyl pyrophosphate is synthesized by dimethylallyltransferase, which is a
common enzyme at the initial stage of the biosynthesis of not only carotenoids but also sterols or
terpenes [J. Biochem., 72, 1101-1108 (1972)]. Accordingly, if a cell which cannot synthesize carotenoids
can synthesize sterols or terpenes, it probably contains geranylgeranyl pyrophosphate. It is believed
that a cell contains at least one of sterols or terpenes.

Therefore, it is believed theoretically that almost all hosts are capable of synthesizing carotenoids by
using the DNA sequences of the present invention as far as a suitable host-vector system is available.

As hosts for which a host-vector system is available, there may be mentioned plants such as Nicotiana
tabacum, Petunia hybrida and the like, microorganisms such as bacteria, for example Escherichia coli,
Zymomonas mobilis and the like, and yeasts, for example Saccharomyces cerevisiae and the like.



(2) Transformation 

It is confirmed for the first time by the present invention that the genetic information present on the
DNA sequences of the present invention is expressed in microorganisms. However, the procedures
or methods for making the transformants (and for the production of enzymes and, in turn, carotenoid
pigments or carotenoid pigment-related compounds by the transformants) are per se conventional in the
fields of molecular biology, cell biology or genetic manipulation, and thus procedures other than those
described below may be performed in accordance with these conventional techniques.

In order to express the gene of the DNA sequences according to the present invention in a host, it is
necessary to insert the gene into a vector for introducing it into the host. As the vector used in this stage,
any of various known vectors can be used, such as pBI121 or the like for plants (Nicotiana tabacum, Petunia
hybrida); pUC19, pACYC184 or the like for E. coli; pZA22 or the like for Zymomonas mobilis (see Japanese
Patent Laid-Open Publication No. 228278/87); and YEp13 or the like for yeast.

On the other hand, it is necessary to transcribe the DNA sequence of the present invention into mRNA
in order to express the gene of the DNA sequence in the host. For this purpose, a promoter as a signal for
transcription may be integrated into the 5'-upstream region of the DNA sequence of the present
invention. A variety of promoters are known, such as CaMV35S, NOS, TR1', TR2' (for plants); lac, Tcr, CAT, trp
(for E. coli); Tcr, CAT (for Zymomonas mobilis); ADH1, GAL7, PGK, TRP1 (for yeast) and the like, and
any of these promoters can be used in the present invention.

In the case of a procaryote, it is necessary to place a ribosome-binding site (the SD sequence in E. coli)
several bases upstream from the initiation codon (ATG).
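By way of illustration only, the placement rule stated above can be checked on a candidate construct with a short script. The following Python sketch is not part of the disclosed method: the function name `find_sd_starts`, the motif AGGAG (a core of the E. coli SD consensus AGGAGG) and the spacing window of 4 to 10 bases are assumptions chosen for the example, and the test sequence is invented.

```python
import re


def find_sd_starts(seq, motif="AGGAG", min_gap=4, max_gap=10):
    """Return (1-based ATG position, gap) for ATGs preceded by an SD-like motif.

    The gap is the number of bases between the end of the motif and the A of ATG.
    Motif and gap window are illustrative assumptions, not values from the text.
    """
    seq = seq.upper()
    hits = []
    for match in re.finditer("ATG", seq):
        atg = match.start()
        for gap in range(min_gap, max_gap + 1):
            motif_start = atg - gap - len(motif)
            if motif_start >= 0 and seq[motif_start:motif_start + len(motif)] == motif:
                hits.append((atg + 1, gap))
                break
    return hits


# Invented toy sequence: AGGAG, six spacer bases, then ATG.
toy = "TTTAGGAGGTCAACATGAAATAA"
print(find_sd_starts(toy))
```

A construct whose initiation codon is not reported by such a check would be a candidate for adding a ribosome-binding site upstream, subject to the assumed motif and spacing.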

In this connection, while the aforementioned manipulation is necessary for producing the enzyme
protein, one or more amino acids may be inserted into or added to the polypeptide which is illustrated in
the specific ranges of Figs. 1 - 6 (e.g. the polypeptide A - B illustrated in Fig. 1), or one or more amino acids
may be deleted or replaced, as described above.

The transformation of the host with the plasmid thus obtained can be conducted by any
appropriate method which is conventionally used in the fields of genetic manipulation or cell biology. As for
general matters, reference can be made to appropriate publications or reviews; for example, as for the
transformation of microorganisms, T. Maniatis, E. F. Fritsch and J. Sambrook: "Molecular Cloning: A
Laboratory Manual", Cold Spring Harbor Laboratory (1982).

The transformant is the same as the host used in its genotype, phenotype or bacteriological properties
except for the new trait derived from the genetic information introduced by the DNA sequence of the present
invention (that is, the production of an enzyme participating in carotenoid formation and the synthesis of
carotenoids or the like by the enzyme), the trait derived from the vector used, and the loss of any trait
corresponding to the deletion of a part of the genetic information of the vector which might be caused on
the recombination of genes. Escherichia coli JM109 (pCAR1), which is an example of the transformant
according to the present invention, is deposited as FERM BP-2377.



Expression of genetic information/production of carotenoids 

The clone of the transformant obtained as described above produces, mainly within the cell, an
enzyme participating in carotenoid formation, and a variety of carotenoids or carotenoid pigment-related
compounds are synthesized by the enzyme.

The culture and the culturing conditions of the transformant are essentially the same as those for the host used.
Carotenoids can be recovered by the methods illustrated, for example, in Experimental Examples 3 and
4 below.

Furthermore, each enzyme protein coded by each DNA sequence of the present invention is produced
mainly in the cell in the case of the transformation of E. coli, and it can be recovered by an appropriate
method.

BRIEF DESCRIPTION OF THE DRAWINGS 



Figs. 1 - 6 illustrate the nucleotide sequences in the DNA sequences ① - ⑥ in coding regions, and
the amino acid sequences of proteins to be encoded, respectively,
Fig. 7 illustrates the KpnI-HindIII fragment which was acquired from Erwinia uredovora 20D3 ATCC
19321 and relates to the biosynthesis of carotenoids, that is, the complete nucleotide sequence of the 6918
bp DNA sequence containing the DNA sequences in Figs. 1 - 6, and
Fig. 8 illustrates the function of the polypeptides encoded by the aforementioned DNA sequences ① - ⑥.

Experiments 

All of the strains used in the following experiments are deposited in ATCC or other deposition organizations
and are freely available.

Experimental Example 1: Cloning of a gene cluster participating in the biosynthesis of a yellow pigment
(referred to hereinafter as yellow pigment-synthesizing gene cluster)



(1) Preparation of total DNA

Total DNA was prepared from cells of Erwinia uredovora 20D3 ATCC 19321 which had been
grown to the early stationary phase in 100 ml of LB medium (1% tryptone, 0.5% yeast extract, 1%
NaCl). Penicillin G (manufactured by Meiji Seika) was added to the culture medium to a
concentration of 50 units/ml 1 hour before harvesting the cells. After the cells were harvested
by centrifugation, they were washed with TES buffer (20 mM Tris, 10 mM EDTA, 0.1 M NaCl, pH 8),
heat treated at 68°C for 15 minutes and suspended in Solution I (50 mM glucose, 25 mM Tris, 10 mM
EDTA, pH 8) containing 5 mg/ml of lysozyme (manufactured by Seikagaku Kogyo) and 100 µg/ml of RNase
A (manufactured by Sigma). The suspension was incubated at 37°C for 30 minutes to 1 hour,
and pronase E (manufactured by Kaken Seiyaku) was added to a concentration of 250 µg/ml
before incubation at 37°C for 10 minutes. Sodium N-lauroylsarcosine (manufactured by Nacalai Tesque)
was added to a final concentration of 1%, and the mixture was agitated before incubation at
37°C for several hours. Extraction was conducted several times with phenol/chloroform. While 2 volumes
of ethanol were slowly added, the resulting total DNA was wound around a glass rod, rinsed
with 70% ethanol and dissolved in 2 ml of TE buffer (10 mM Tris, 1 mM EDTA, pH 8) to give the total DNA
preparation.
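Several steps above specify a final concentration to be reached by adding a reagent to an existing suspension. As a hedged aid, the following Python sketch computes the volume of a stock solution to add. The helper name `volume_to_add`, the 20 mg/ml stock strength and the 10 ml suspension volume are assumptions for illustration; the text specifies only the final concentrations.

```python
def volume_to_add(final_conc, stock_conc, start_volume):
    """Volume of stock to add so the mixture reaches final_conc.

    Derived from final_conc = stock_conc * v / (start_volume + v).
    Concentrations share one unit; the result is in the unit of start_volume.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * start_volume / (stock_conc - final_conc)


# Pronase E to 250 ug/ml (from the text), using a hypothetical 20 mg/ml stock
# and a hypothetical 10 ml suspension.
v_ml = volume_to_add(final_conc=250.0, stock_conc=20000.0, start_volume=10.0)
print(f"add {v_ml * 1000:.0f} ul of stock")
```

The formula accounts for the volume increase caused by the addition itself, which matters when the stock is not much more concentrated than the target.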

(2) Construction of an Escherichia coli cosmid library and acquisition of E. coli transformants producing
yellow pigments

Incubation was conducted with 1 unit of restriction enzyme Sau3AI per 50 µl of the total DNA
preparation at 37°C for 30 minutes, followed by inactivation of the restriction enzyme at 68°C for
10 minutes. Under this condition, many fragments partially digested with Sau3AI were obtained in the
neighbourhood of 40 kb. After ethanol precipitation of this reaction solution, half of it was mixed with
2.5 µg of cosmid pJB8 which had been digested with BamHI and treated with alkaline phosphatase and 0.2
ng of a pJB8 SalI-BamHI right arm fragment (smaller fragment) which had been recovered from a gel, and
40 µl of the total amount was subjected to a ligation reaction with T4 DNA ligase at 12°C for 2 days. In this
connection, the cosmid pJB8 had been previously purchased from Amersham. Restriction enzymes and
enzymes used for genetic manipulation were purchased from Boehringer-Mannheim, Takara Shuzo or Wako
Pure Chemical Industries. The DNA thus ligated was used for in
vitro packaging with Gigapack Gold (manufactured by Stratagene, marketed by Funakoshi) to give
phage particles in an amount sufficient for construction of a cosmid library. Escherichia coli DH1
(ATCC 33849) was infected with the phage particles. After the infected E. coli DH1 cells were diluted so as
to give 100 colonies per plate, they were plated on LB plates and cultured at 37°C overnight and further at
30°C for 6 hours or more. As a result, E. coli transformants producing yellow pigments appeared in a
proportion of one colony per about 1,100 colonies. These E. coli transformants producing yellow pigments
contained plasmids in which 33 - 47 kb Sau3AI partial digestion fragments were inserted into pJB8.
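The screening effort implied by the figures above (about one yellow colony per 1,100 colonies, at 100 colonies per plate) can be estimated with a simple independence model. The following Python sketch is illustrative only: the function names are hypothetical and the assumption that colonies are independent trials with a fixed hit rate is a simplification not stated in the text.

```python
import math


def prob_at_least_one(hit_rate, n_colonies):
    """Probability that at least one of n independent colonies is a hit."""
    return 1.0 - (1.0 - hit_rate) ** n_colonies


def colonies_needed(hit_rate, confidence):
    """Smallest colony count giving at least `confidence` chance of one hit."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - hit_rate))


HIT_RATE = 1 / 1100        # about one yellow colony per 1,100 (from the text)
COLONIES_PER_PLATE = 100   # plating density (from the text)

n = colonies_needed(HIT_RATE, 0.95)           # colonies for a 95% chance
plates = math.ceil(n / COLONIES_PER_PLATE)    # plates at 100 colonies each
print(n, plates, round(prob_at_least_one(HIT_RATE, n), 3))
```

Under these assumptions a few thousand colonies, that is, a few dozen plates, would be screened to be reasonably sure of recovering one pigmented clone.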

(3) Location of a yellow pigment-synthesizing gene cluster

A yellow pigment-synthesizing gene cluster was inserted into pJB8 as the 33 - 47 kb Sau3AI partial
digestion fragments. One of these fragments was further subjected to partial digestion with Sau3AI, ligated
to the BamHI site of the E. coli vector pUC19 (purchased from Takara Shuzo), and used to transform
Escherichia coli JM109 (manufactured by Takara Shuzo). To locate the yellow pigment-synthesizing gene
cluster, plasmid DNAs were prepared from 50 E. coli transformants producing yellow pigments which
appeared on the LB plate containing ampicillin, and analyzed by agarose gel electrophoresis. As a result, it
was found that the smallest inserted fragment was 8.2 kb. The plasmid containing this 8.2 kb fragment
was named pCAR1, and E. coli JM109 harboring this plasmid was named Escherichia coli JM109
(pCAR1). This strain produced the same yellow pigments as those of E. uredovora. The 8.2 kb fragment
contained a KpnI site near the terminal at the lac promoter side and a HindIII site
near the opposite terminal. After the 8.2 kb fragment was subjected to double digestion with
KpnI/HindIII (HindIII digestion was partial; the 8.2 kb fragment had two HindIII sites), the KpnI-HindIII
fragment (6.9 kb) was recovered from a gel and ligated to the KpnI-HindIII site of pUC18 (this hybrid
plasmid was named pCAR15). Upon the transformation of E. coli JM109, the E. coli transformant
was yellow and produced the same yellow pigments as those of E. uredovora. Accordingly, it was
found that the genes required for yellow pigment production were located on the KpnI-HindIII
fragment (6.9 kb). That is to say, the fragment carrying the yellow pigment-synthesizing genes could
be reduced to 6.9 kb in size.



Experimental Example 2: Analysis of the yellow pigment-synthesizing gene cluster

(1) Determination of the nucleotide sequence of the yellow pigment-synthesizing gene cluster

The complete nucleotide sequence of the 6.9 kb KpnI-HindIII fragment was determined by the
kilo-sequence method using Deletion kit for kilo-sequence (manufactured by Takara Shuzo) and the dideoxy
method according to Proc. Natl. Acad. Sci. USA, 74, 5463-5467 (1977). As a result, it was found that the
KpnI-HindIII fragment containing the yellow pigment-synthesizing genes (DNA strand) was 6918 base pairs
(bp) in length and its GC content was 54%. The complete nucleotide sequence is shown in Fig. 7 (a) - (g).
The KpnI site is represented by the base number 1.
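For reference, a GC content such as the 54% reported above is the fraction of G and C among the bases of the sequence. The following Python sketch shows the calculation on a short invented sequence; it does not reproduce the 6918 bp sequence of Fig. 7, and the function name `gc_content` is an assumption of this example.

```python
def gc_content(seq):
    """Fraction of G and C among the A, C, G, T bases of a sequence."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / acgt


# Invented toy sequence beginning with the KpnI recognition sequence GGTACC.
toy = "GGTACCGCAA"
print(f"{gc_content(toy):.0%}")
```

Applied to the full sequence read from Fig. 7, the same function would be expected to return a value near 0.54.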

(2) Elucidation of yellow pigment-synthesizing gene cluster

The HindIII side of the 6918 bp fragment (DNA strand) containing the yellow pigment-synthesizing
genes (right terminal side in Fig. 7) was deleted with Deletion kit for kilo-sequence. A hybrid plasmid
(designated pCAR25) was constructed by inserting into pUC19 a 1 - 6503 fragment, which was obtained
by deletion from the HindIII site to nucleotide position 6504. E. coli JM109 harboring pCAR25 [referred to
hereinafter as E. coli (pCAR25)] was yellow and produced the same yellow pigments as those of E.
uredovora. Therefore, it was thought that the region from the base number 6504 to 6918 in Fig. 7 was not
required for yellow pigment production. The nucleotide sequence in the region from the base number 1 to
6503 in the 6918 bp DNA sequence containing the yellow pigment-synthesizing genes was analyzed. As a
result, it was found that there were six open reading frames (ORFs). That is to say, there were an ORF
coding for a polypeptide with a molecular weight of 32,583 from the base number 225 to 1130 (referred to
as ORF1, which corresponds to A - B in Figs. 1 and 7), an ORF coding for a polypeptide with a molecular
weight of 47,241 from the base number 1143 to 2435 (referred to as ORF2, which corresponds to C - D in
Figs. 2 and 7), an ORF coding for a polypeptide with a molecular weight of 43,047 from the base number
2422 to 3567 (referred to as ORF3, which corresponds to E - F in Figs. 3 and 7), an ORF coding for a
polypeptide with a molecular weight of 55,007 from the base number 3582 to 5057 (referred to as ORF4,
which corresponds to G - H in Figs. 4 and 7), an ORF coding for a polypeptide with a molecular weight of
33,050 from the base number 5096 to 5983 (referred to as ORF5, which corresponds to I - J in Figs. 5 and
7), and an ORF coding for a polypeptide with a molecular weight of 19,816 from the base number 6452 to
5928 (referred to as ORF6, which corresponds to K - L in Figs. 6 and 7; only ORF6 has the opposite
orientation to the others). In this connection, each ORF contained, several bases upstream from
its initiation codon, the SD (Shine-Dalgarno) sequence which is homologous with the 3'-region of 16S
ribosomal RNA of E. coli. Thus, it was thought that polypeptides were in fact synthesized in E. coli from these
six ORFs. This was confirmed by the following in vitro transcription-translation experiment.
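The coordinates given above allow a quick consistency check: each ORF should span a whole number of nucleotide triplets, and the orientation follows from whether the start coordinate is smaller than the end coordinate. The following Python sketch uses only the base numbers stated in the text; the helper names are assumptions, and whether each stated range includes the stop codon is not specified in the text and is not assumed here.

```python
# (start, end) base numbers as stated in the text; start > end means opposite strand.
ORFS = {
    "ORF1": (225, 1130),
    "ORF2": (1143, 2435),
    "ORF3": (2422, 3567),
    "ORF4": (3582, 5057),
    "ORF5": (5096, 5983),
    "ORF6": (6452, 5928),
}


def span(start, end):
    """Return the coordinates as (low, high) regardless of orientation."""
    return (min(start, end), max(start, end))


def length_bp(start, end):
    low, high = span(start, end)
    return high - low + 1


def strand(start, end):
    return "+" if start <= end else "-"


def overlap_bp(a, b):
    """Number of base positions shared by two ORF ranges."""
    (a_low, a_high), (b_low, b_high) = span(*a), span(*b)
    return max(0, min(a_high, b_high) - max(a_low, b_low) + 1)


for name, (start, end) in ORFS.items():
    n = length_bp(start, end)
    print(name, strand(start, end), n, "bp,", n // 3, "triplets, remainder", n % 3)

print("ORF2/ORF3 overlap:", overlap_bp(ORFS["ORF2"], ORFS["ORF3"]), "bp")
print("ORF5/ORF6 overlap:", overlap_bp(ORFS["ORF5"], ORFS["ORF6"]), "bp")
```

All six ranges are multiples of three in length, and the stated coordinates imply that ORF2 and ORF3 share 14 base positions while ORF5 and the oppositely oriented ORF6 share 56.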

That is to say, the in vitro transcription-translation analysis was carried out with DNA in which the
plasmid pCAR25 containing ORF1 - ORF6 had been digested with ScaI, and with DNAs in which respective
fragments containing the respective ORFs (including the SD sequence) of ORF1 - ORF6 had been digested
with appropriate restriction enzymes, isolated, inserted into pUC19 or pUC18 so as to be subjected to
transcriptional read-through from a lac promoter, and then digested with ScaI. In this experiment, a
Prokaryotic DNA-directed translation kit manufactured by Amersham was used. As a result, it was confirmed
that the bands of polypeptides corresponding to the aforementioned respective ORFs were detected as the
transcription-translation products.

Moreover, all six ORFs were necessary for production of the same yellow pigments as those of E.
uredovora, as described below (Experimental Examples 3, 4 and 5). From these results, ORF1, ORF2,
ORF3, ORF4, ORF5 and ORF6 were designated as the zexA, zexB, zexC, zexD, zexE and zexF genes,
respectively.

The base numbers in Figs. 1 - 6 are given on the basis of the KpnI site in Fig. 7 as the base
number 1 and correspond to each other. The marks A - L in Figs. 1 - 6 correspond to the marks A - L in
Fig. 7. The DNA sequence from K to L in Fig. 6 is that of the complementary strand of the DNA
sequence from K to L in Fig. 7. That is to say, the DNA sequence illustrated in Fig. 6 has the opposite
orientation in transcription to the DNA sequences in Figs. 1 - 5 in the original DNA sequence (Fig. 7).
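The relationship between Fig. 6 and Fig. 7 described above (a coding sequence read from the complementary strand) corresponds to taking the reverse complement of the region. The following Python sketch illustrates this on an invented sequence; the function names and the convention that a start coordinate larger than the end coordinate denotes the complementary strand are assumptions of this example.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand read in its own 5' to 3' direction."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(seq, start, end):
    """Extract a region given 1-based inclusive base numbers.

    If start > end, the region is read from the complementary strand,
    as for the ORF numbered from 6452 to 5928 in the text.
    """
    if start <= end:
        return seq[start - 1:end]
    return reverse_complement(seq[end - 1:start])


toy = "AAACATTTGGG"                 # invented sequence, not from Fig. 7
print(extract_region(toy, 1, 3))    # forward strand
print(extract_region(toy, 7, 4))    # complementary strand, begins with ATG
```

With the full sequence of Fig. 7, `extract_region(seq, 6452, 5928)` would be expected to yield the coding sequence shown in Fig. 6.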



(3) Analysis of homology by the DNA-DNA hybridization method 

Total DNA of Erwinia herbicola Eho 10 ATCC 39368 was prepared in the same manner as in
Experimental Example 1 (1). A 7.6 kb fragment containing the DNA sequence in Fig. 7 was cut out from the
hybrid plasmid pCAR1 by KpnI digestion and labeled with DNA labeling & detection kit nonradioactive
(manufactured by Boehringer-Mannheim) according to the DIG-ELISA method to give probe DNA. The
homology of total DNAs (intact or KpnI digested) of E. herbicola Eho 10 ATCC 39368 and E. uredovora
20D3 ATCC 19321 with this probe DNA was analyzed by the DNA-DNA hybridization method with the
aforementioned DNA labeling & detection kit nonradioactive. As a result, the probe DNA hybridized
strongly with total DNA of the latter, E. uredovora 20D3 ATCC 19321, but not at all with total DNA of the
former, E. herbicola Eho 10 ATCC 39368. Also, the restriction map deduced from the DNA sequence in Fig.
7 was quite different from that reported in J. Bacteriol., 168, 607-612 (1986). It was concluded from the
above described results that the DNA sequence in Fig. 7, that is, the DNA sequences useful for the
synthesis of carotenoids according to the present invention, exhibits no homology with the DNA sequence
containing the yellow pigment-synthesizing genes of E. herbicola Eho 10 ATCC 39368.


Experimental Example 3: Analysis of yellow pigments 

E. coli (pCAR25) produced the same yellow pigments as those of E. uredovora 20D3 ATCC 19321 and
E. herbicola Eho 10 ATCC 39368, and its yield was 5 times higher than that of the former and 6 times
higher than that of the latter (per dry weight). The cells harvested from 8 liters of 2 x YT medium (1.6%
tryptone, 1% yeast extract, 0.5% NaCl) were extracted once with 1.2 liter of methanol. The methanol extract
was evaporated to dryness, dissolved in methanol, and subjected to thin layer chromatography (TLC) with
silica gel 60 (Merck) (developed with chloroform : methanol = 4:1). The yellow pigments were separated
into 3 spots having Rf values of 0.93, 0.62 and 0.30 by TLC. The yellow (to orange) pigment at the Rf value
of 0.30, which was the strongest spot, was scraped from the TLC plate, extracted with a small amount of
methanol, loaded on a Sephadex LH-20 column for chromatography [30 cm x 3.0 cm (ø)] and developed
and eluted with methanol to give 4 mg of a pure product. The yellow (to orange) pigment obtained was
sparingly soluble in organic solvents other than methanol and easily soluble in water, which
suggested that it might be a carotenoid glycoside. This suggestion was also supported by a molecular
weight of 892 by FD-MS spectrum (the mass of this pigment was larger than that of zeaxanthin (described
hereinafter) by the mass of two glucose units). When this substance was hydrolyzed with 1 N HCl at 100°C for 10
minutes, zeaxanthin was obtained. Then, acetylation was conducted according to the usual method. That is,
the substance was dissolved in 10 ml of pyridine, a large excess of acetic anhydride was added, and the
mixture was stirred at room temperature and left standing overnight. After the completion of the reaction, water
was added to the mixture and chloroform extraction was carried out. The chloroform extract was
concentrated and loaded on a silica gel column [30 cm x 3.0 cm (ø)] for chromatography, and developed and eluted
with chloroform. Measurement of 1H-NMR gave a spectrum identical with that of the tetraacetyl derivative of
zeaxanthin-β-diglucoside [Helvetica Chimica Acta, 57, 1641-1651 (1974)], so that the substance was
identified as zeaxanthin-β-diglucoside (its structure being illustrated below).

The yield was 1.1 mg/g dry weight. The substance had a solubility of at least 2 mg in 100 ml of water
and of methanol, and water was superior to methanol in solubility of the substance. The substance had low
solubilities in chloroform and acetone, its solubilities being 0.5 mg in 100 ml of these solvents.
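The FD-MS argument above can be verified with nominal masses: each glucose attached through a glycosidic bond adds the mass of glucose minus one water. The following Python sketch performs this arithmetic. The molecular formulas of zeaxanthin (C40H56O2) and glucose (C6H12O6) are standard values rather than figures from the text, and the function names are assumptions.

```python
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}

ZEAXANTHIN = {"C": 40, "H": 56, "O": 2}   # standard formula, not from the text
GLUCOSE = {"C": 6, "H": 12, "O": 6}
WATER = {"H": 2, "O": 1}


def nominal_mass(formula):
    """Sum of integer atomic masses for a formula given as {element: count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


def glycoside_mass(aglycone, sugar, n_sugars):
    """Mass after attaching n sugars, each with loss of one water."""
    residue = nominal_mass(sugar) - nominal_mass(WATER)
    return nominal_mass(aglycone) + n_sugars * residue


print(nominal_mass(ZEAXANTHIN))                 # 568
print(glycoside_mass(ZEAXANTHIN, GLUCOSE, 2))   # 892
```

The values 568 and 892 agree with the FD-MS molecular weights reported for zeaxanthin and for the pigment identified as zeaxanthin-β-diglucoside.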




Experimental Example 4: Analysis of the metabolic intermediates of carotenoids

(1) Construction of various deletion plasmids

A hybrid plasmid (designated pCAR16) was constructed by inserting a 1 - 6009 fragment, which was
obtained by deletion from the HindIII site (right terminal in Fig. 7) to nucleotide position 6010 of the 6918 bp
fragment containing the yellow pigment-synthesizing genes (DNA strand) (Fig. 7) using Deletion kit for
kilo-sequence. pCAR16 contains the genes from zexA to zexE. Various deletion plasmids were constructed, as
shown in Table 1, on the basis of pCAR16 and the aforementioned hybrid plasmid pCAR25 (containing
the genes from zexA to zexF).



Table 1: Construction of various deletion plasmids 

The number within parentheses after the name of each restriction enzyme represents the
base number at the first recognition site of the restriction enzyme. The base numbers correspond to
those in Figs. 1 - 6 and Fig. 7. Analysis of the metabolic intermediates of carotenoids was performed using
the transformants of E. coli JM109 obtained with the various deletion plasmids [referred to hereinafter as
E. coli (name of plasmid)].



Table 1

Plasmid       Construction method                                                  Genes functioning

pCAR25        See text                                                             zexA zexB zexC zexD zexE zexF
pCAR25delB    Frame shift in BstEII (1235) of pCAR25                               zexA zexC zexD zexE zexF
pCAR16        See text                                                             zexA zexB zexC zexD zexE
pCAR16delB    Frame shift in BstEII (1235) of pCAR16                               zexA zexC zexD zexE
pCAR16delC    Frame shift in SnaBI (3497) of pCAR16                                zexA zexB zexD zexE
pCAR-ADE      Deletion of the BstEII (1235) - SnaBI (3497) fragment from pCAR16    zexA zexD zexE
pCAR-ADEF     Deletion of the BstEII (1235) - SnaBI (3497) fragment from pCAR25    zexA zexD zexE zexF
pCAR25delD    Frame shift in BamHI (3652) of pCAR25                                zexA zexB zexC zexE zexF
pCAR-AE       Deletion of the BstEII (1235) - BamHI fragment from pCAR16           zexA zexE
pCAR-A        Insertion of the KpnI (1) - BstEII (1235) fragment in pUC19          zexA
pCAR-E        Insertion of the Eco52I (4926) - 6009 fragment in pUC19              zexE
pCAR25delE    Frame shift in MluI (5379) of pCAR25                                 zexA zexB zexC zexD zexF
pCAR25delA    Frame shift in AvaI (995) of pCAR25                                  zexB zexC zexD zexE zexF
pCAR-CDE      Insertion of the SalI (2295) - 6009 fragment in pUC19                zexC zexD zexE





(2) Identification of zeaxanthin 

The cells harvested from 3 liters of 2 x YT medium of E. coli (pCAR25delB) (exhibiting orange) were
extracted twice with 400 ml portions of acetone at low temperature, concentrated, then extracted with
chloroform:methanol (9:1) and evaporated to dryness. This was subjected to silica gel column
chromatography [30 cm x 3.0 cm (ø)]. After the column was washed with chloroform, an orange band was eluted with
chloroform:methanol (100:1). This pigment was dissolved in ethanol and recrystallized at low temperature to
give 8 mg of a pure product. The analysis by its UV-visible absorption, 1H-NMR, 13C-NMR and FD-MS (m/e
568) spectra revealed that this substance had the same structure, except for stereochemistry, as zeaxanthin
(β,β-carotene-3,3'-diol). It was then dissolved in diethyl ether : isopentane : ethanol (5:5:2), and the CD
spectrum was measured. As a result, it was found that this substance had 3R,3'R-stereochemistry
[Phytochemistry, 27, 3605-3609 (1988)]. Therefore, it was identified as zeaxanthin (β,β-carotene-3R,3'R-diol),
of which the structure is illustrated below. The yield was 22 mg/g dry weight. This substance
corresponded to the yellow pigment having an Rf value of 0.93 in Experimental Example 3.




(3) Identification of β-carotene

The cells harvested from 3 liters of LB medium of E. coli (pCAR16) (exhibiting orange) were extracted 3
times with 500 ml portions of cold methanol at low temperature, and the methanol extract was further
extracted with 1.5 liter of hexane. The hexane layer was concentrated and subjected to silica gel column
chromatography [30 cm x 3.0 cm (ø)]. Development and elution were conducted with hexane:ethyl acetate
(50:1) to collect an orange band. The orange fraction was concentrated and recrystallized from ethanol to
give 8 mg (reduced weight without moisture). This substance was presumed to be β-carotene from
its UV-visible absorption spectrum, and a molecular weight of 536 by FD-MS spectrum also supported this
presumption. Upon comparing this substance with the authentic sample (Sigma) of β-carotene by 13C-NMR
spectrum, all of the chemical shifts of the carbons were identical with each other. Thus, this substance was
identified as β-carotene (all-trans-β,β-carotene, of which the structure is illustrated below). It was also
confirmed by a similar method that E. coli (pCAR16delB) accumulated the same β-carotene as described
above. Its yield was 2.0 mg/g dry weight, which corresponded to 2 - 8 times (per dry weight) the total
carotenoid yield in carrot (Kintokininjin) culture cells described in Soshikibaiyou (The Tissue Culture), 13,
379-382 (1987).




(4) Identification of lycopene

The cells harvested from 3 liters of LB medium of E. coli (pCAR16delC) (exhibiting red) were extracted
once with 500 ml of cold methanol at low temperature, and the precipitate obtained by centrifugation was extracted
again with 1.5 liter of chloroform. The chloroform layer was concentrated and subjected to silica gel
chromatography [30 cm x 3.0 cm (ø)]. Development and elution were conducted with hexane:chloroform
(1:1) to collect a red band. This fraction was concentrated. This substance was presumed to be
lycopene from its UV-visible absorption spectrum, and a molecular weight of 536 by FD-MS spectrum also
supported this presumption. Upon comparing this substance with the authentic sample (Sigma) of lycopene
by 1H-NMR spectrum, all of the chemical shifts of the hydrogens were identical with each other. When this
substance and the authentic sample were subjected to TLC with silica gel 60 (Merck) [developed with
hexane:chloroform (50:1)] and with RP-18 [developed with methanol:chloroform (4:1)], the migration
distances of these samples were completely equal to each other. Thus this substance was identified as
lycopene (all-trans-ψ,ψ-carotene, of which the structure is illustrated below). It was also confirmed by a
similar method that E. coli (pCAR-ADE) and E. coli (pCAR-ADEF) accumulated the same lycopene as
described above. The yield of the former was 2.0 mg/g dry weight, which corresponded to 2 times (per dry
weight) the total carotenoid yield in a hyperproduction derivative of carrot (Kintokininjin) culture cells
described in Soshikibaiyou (The Tissue Culture), 13, 379-382 (1987).

(5) Identification of phytoene

The cells harvested from 1.5 liter of 2 x YT medium of E. coli (pCAR-AE) were extracted twice with 200
ml portions of acetone and twice with 100 ml portions of hexane, and evaporated to dryness. This was
subjected to silica gel chromatography [30 cm x 3.0 cm (ø)]. Development and elution were conducted
with hexane:chloroform (1:1) to collect a band which had a strong UV absorption, and it was confirmed to be
phytoene by its UV absorption spectrum. It was further subjected to LH-20 column chromatography [30 cm
x 3.0 cm (ø)]. Development and elution were conducted with chloroform:methanol (1:1) to give 4 mg of a
pure product. The comparison of the 1H-NMR spectrum of this substance with the 1H-NMR spectra of trans-
and cis-phytoene [J. Magnetic Resonance, 10, 43-50 (1973)] showed this substance to be a mixture of the
trans- and cis-isomers. Isomerization from the cis-isomer to the trans-isomer hardly occurs, and thus it was judged
that such a mixture was produced as a result of trans-cis isomerization in the course of the purification.
Therefore, it was concluded that the original phytoene was the trans-type phytoene (all-trans-phytoene, of
which the structure is shown below). It was also confirmed by a similar method that E. coli
(pCAR25delD) accumulated the same phytoene as described above.




Experimental Example 5: Identification of carotenoid biosynthesis genes

From the facts that E. coli (pCAR25) produced zeaxanthin-diglucoside and that E. coli (pCAR25delB),
harboring a plasmid in which zexB had been removed from pCAR25, accumulated zeaxanthin, it was found
that the zexB gene encoded the glycosylation enzyme which was capable of converting zeaxanthin into
zeaxanthin-diglucoside. Similarly, from the fact that E. coli (pCAR16delB), harboring a plasmid in which
zexF had been removed from pCAR25delB, accumulated β-carotene, it was found that the zexF gene
encoded the hydroxylation enzyme which was capable of converting β-carotene into zeaxanthin. Similarly,
from the fact that E. coli (pCAR-ADE), harboring a plasmid in which zexC had been removed from
pCAR16delB, accumulated lycopene, it was found that the zexC gene encoded the cyclization enzyme
which was capable of converting lycopene into β-carotene. Also, E. coli (pCAR-ADEF), carrying both the
zexA, zexD and zexE genes required for producing lycopene and the zexF gene encoding the hydroxylation
enzyme, was able to synthesize only lycopene. This demonstrates directly that the hydroxylation reaction in
carotenoid biosynthesis occurs after the cyclization reaction. Further, from the facts that E. coli (pCAR-ADE)
accumulated lycopene and that E. coli (pCAR-AE), harboring a plasmid in which the zexD gene had been
removed from pCAR-ADE, accumulated phytoene, it was found that the zexD gene encoded the
desaturation enzyme which was capable of converting phytoene into lycopene. E. coli (pCAR-A) and E. coli
(pCAR-E) were not able to produce phytoene. It was thought from this result that both the zexA and zexE
genes were required for producing phytoene in E. coli. zexE and zexA were identified as the gene for the
conversion of geranylgeranyl pyrophosphate into prephytoene pyrophosphate and that for the conversion of
prephytoene pyrophosphate into phytoene, respectively, by comparing their putative amino acid sequences with those of the
crtB and crtE gene products in a photosynthetic bacterium, Rhodobacter capsulatus [Mol. Gen. Genet.,
216, 254-268 (1988)]. From the analyses described above, all six zex genes have been identified
and the biosynthetic pathway of carotenoids has also been made clear. These results are summarized in Fig. 8.
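The gene assignments above can be summarized as a linear pathway in which each step requires one zex gene, so that the compound accumulated by a transformant is the last one reachable with the genes it carries. The following Python sketch encodes this reading and compares it with the reported accumulations. It is a simplified model offered for illustration: the names `PATHWAY` and `predicted_end_product` are assumptions, and the model deliberately ignores trace non-enzymatic conversion.

```python
START = "geranylgeranyl pyrophosphate"

# (gene required, product formed), in pathway order as identified in the text.
PATHWAY = [
    ("zexE", "prephytoene pyrophosphate"),
    ("zexA", "phytoene"),
    ("zexD", "lycopene"),
    ("zexC", "beta-carotene"),
    ("zexF", "zeaxanthin"),
    ("zexB", "zeaxanthin-diglucoside"),
]


def predicted_end_product(genes):
    """Walk the pathway until a required gene is missing."""
    product = START
    for gene, next_product in PATHWAY:
        if gene not in genes:
            break
        product = next_product
    return product


# Functioning genes per plasmid (Table 1) and the accumulation reported in the text.
OBSERVED = {
    "pCAR25": ({"zexA", "zexB", "zexC", "zexD", "zexE", "zexF"}, "zeaxanthin-diglucoside"),
    "pCAR25delB": ({"zexA", "zexC", "zexD", "zexE", "zexF"}, "zeaxanthin"),
    "pCAR16": ({"zexA", "zexB", "zexC", "zexD", "zexE"}, "beta-carotene"),
    "pCAR16delC": ({"zexA", "zexB", "zexD", "zexE"}, "lycopene"),
    "pCAR-ADEF": ({"zexA", "zexD", "zexE", "zexF"}, "lycopene"),
    "pCAR-AE": ({"zexA", "zexE"}, "phytoene"),
}

for plasmid, (genes, observed) in OBSERVED.items():
    print(plasmid, predicted_end_product(genes), predicted_end_product(genes) == observed)
```

For plasmids lacking zexE or zexA the model predicts that no phytoene or later compound is formed, which matches the reported result for E. coli (pCAR-A) and E. coli (pCAR-E) only in the sense that phytoene was not produced.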

E. coli (pCAR25delE) accumulated no detectable carotenoid intermediate, while E. coli (pCAR25delA)
and E. coli (pCAR-CDE) were able to produce a small amount of carotenoids. That is to say, E. coli
(pCAR25delA) and E. coli (pCAR-CDE) produced 4% of the zeaxanthin-diglucoside and 2% of the β-carotene
produced by E. coli (pCAR25) and E. coli (pCAR16delB), respectively. This result suggests that
the reaction from prephytoene pyrophosphate to phytoene may occur non-enzymatically, although
only in trace yield.

As described above, the detailed biosynthetic pathway of carotenoids, including general and well-known
carotenoids such as lycopene, β-carotene and zeaxanthin and a water-soluble carotenoid such as
zeaxanthin-diglucoside, was elucidated for the first time, and the gene cluster useful for their biosynthesis was
acquired for the first time. In this connection, the lycopene, β-carotene and zeaxanthin which
were produced by the genes in the aforementioned Experimental Examples were stereochemically identical
with those derived from higher plants [T. W. Goodwin: "Plant Pigments", Academic Press (1988)].
As for zeaxanthin-diglucoside, only its isolation from a plant has been reported [Pure & Appl. Chem., 47,
121-128 (1976)], and its isolation from microorganisms has not been reported.



Experimental Example 6: Synthesis of carotenoids in Zymomonas 


Zymomonas mobilis is a facultative anaerobic ethanol-producing bacterium. It has a higher ethanol
production rate than yeast (Saccharomyces cerevisiae), so that it is a promising bacterium for producing
fuel alcohol in the future. Zymomonas also has a special metabolic pathway, the Entner-Doudoroff pathway,
in place of the glycolytic pathway, and cannot produce carotenoids. In order to add further value to this
bacterium, the carotenoid biosynthesis genes were introduced into Zymomonas.

The 7.6 kb fragment containing the DNA sequence shown in Fig. 7 was cut out from the hybrid plasmid
pCAR1 by KpnI digestion and treated with DNA polymerase I (Klenow enzyme). The fragment thus treated
was ligated to the EcoRV site of the cloning vector pZA22 for Zymomonas [see Agric. Biol. Chem., 50,
3201-3203 (1986) and Japanese Patent Laid-Open Publication No. 228278/87] to construct a hybrid plasmid
pZACAR1. Also, the 1-6009 fragment of the DNA sequence in Fig. 7 was cut out from pCAR16 by
KpnI/EcoRI digestion and treated with DNA polymerase I (Klenow enzyme). The fragment thus treated was
ligated to the EcoRV site of pZA22 to construct a hybrid plasmid pZACAR16. The orientation of the inserted
fragments in these plasmids was opposite to the orientation of the Tc^r gene, taking the orientation in
Fig. 7 as the normal one. These plasmids were introduced into Z. mobilis NRRL B-14023 by conjugal
transfer with the helper plasmid pRK2013 (ATCC 37159) and were stably maintained in this strain. Z. mobilis
NRRL B-14023 into which pZACAR1 and pZACAR16 had been introduced were yellow, and produced
zeaxanthin-diglucoside in an amount of 0.28 mg/g dry weight and β-carotene in an amount of 0.14 mg/g dry
weight, respectively. Therefore, carotenoids were successfully produced in Zymomonas by the carotenoid
biosynthesis genes according to the present invention.
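
The fragment coordinates used above (for example, the 1-6009 fragment of the Fig. 7 sequence) are 1-based and inclusive. As a minimal sketch of how such a fragment could be taken from a sequence held as a string, the following Python function may help; the name `subfragment` and the ten-base toy input are illustrative assumptions and are not part of the original disclosure.

```python
def subfragment(sequence, start, end):
    """Return bases start..end (1-based, inclusive), as in 'the 1-6009 fragment'."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Toy input: the first ten bases of the Fig. 7 sequence, which open with
# the KpnI recognition site GGTACC.
toy = "GGTACCGCAC"
print(subfragment(toy, 1, 6))  # GGTACC
```

With the full Fig. 7 sequence loaded as one string, `subfragment(seq, 1, 6009)` would give the insert described for pZACAR16, before any end treatment.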


Deposition of Microorganism 

The microorganism relating to the present invention has been deposited at the Fermentation Research
Institute, Japan, as follows:

Microorganism                      Accession number    Date of deposit
Escherichia coli JM109 (pCAR1)     FERM BP-2377        April 11, 1989



Claims 

1. A DNA sequence encoding a polypeptide which has an enzymatic activity for converting prephytoene
pyrophosphate into phytoene and whose amino acid sequence corresponds substantially to the amino acid
sequence from A to B shown in Figs. 1 (a) and (b).

2. A DNA sequence encoding a polypeptide which has an enzymatic activity for converting zeaxanthin 
into zeaxanthin-diglucoside and whose amino acid sequence corresponds substantially to the amino acid 
sequence from C to D shown in Figs. 2 (a) and (b).

3. A DNA sequence encoding a polypeptide which has an enzymatic activity for converting lycopene
into β-carotene and whose amino acid sequence corresponds substantially to the amino acid sequence
from E to F shown in Figs. 3 (a) and (b).

4. A DNA sequence encoding a polypeptide which has an enzymatic activity for converting phytoene
into lycopene and whose amino acid sequence corresponds substantially to the amino acid sequence from
G to H shown in Figs. 4 (a), (b) and (c).

5. A DNA sequence encoding a polypeptide which has an enzymatic activity for converting geranylgeranyl
pyrophosphate into prephytoene pyrophosphate and whose amino acid sequence corresponds
substantially to the amino acid sequence from I to J shown in Figs. 5 (a) and (b).

6. A DNA sequence encoding a polypeptide which has an enzymatic activity for converting β-carotene
into zeaxanthin and whose amino acid sequence corresponds substantially to the amino acid sequence from
K to L shown in Fig. 6.

7. A process for producing a carotenoid compound which is selected from the group consisting of
prephytoene pyrophosphate, phytoene, lycopene, β-carotene, zeaxanthin and zeaxanthin-diglucoside, which
comprises transforming a host with at least one of the DNA sequences according to claims 1-6, and
culturing the transformant.
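
Read together, claims 1 to 6 describe one linear pathway, each claim covering the DNA sequence for a single conversion. The following Python sketch restates that ordering and estimates the furthest product reachable with a given set of claimed sequences. It is only an illustration: the function name `predicted_end_product`, the assumption that the host already supplies geranylgeranyl pyrophosphate, and the assumption that a missing step blocks the pathway completely are simplifications that are not stated in the claims.

```python
# Each entry: (claim number, substrate, product), in pathway order as
# stated in claims 1-6.
PATHWAY = [
    (5, "geranylgeranyl pyrophosphate", "prephytoene pyrophosphate"),
    (1, "prephytoene pyrophosphate", "phytoene"),
    (4, "phytoene", "lycopene"),
    (3, "lycopene", "beta-carotene"),
    (6, "beta-carotene", "zeaxanthin"),
    (2, "zeaxanthin", "zeaxanthin-diglucoside"),
]


def predicted_end_product(claims_present, start="geranylgeranyl pyrophosphate"):
    """Walk the pathway until a required sequence is missing.

    Assumption: the host supplies the starting compound and a missing
    enzyme stops the pathway entirely.
    """
    product = start
    for claim, _substrate, result in PATHWAY:
        if claim not in claims_present:
            break
        product = result
    return product


print(predicted_end_product({1, 3, 4, 5}))  # beta-carotene
```

The example call corresponds to a host carrying the sequences of claims 1, 3, 4 and 5 only, for which the sketch stops at β-carotene.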

230 240 250 260 270 280

ATGACGGTCTGCGCAAAAAAACACGTTCATCTCACTCGCGATGCTGCGGAGCAGTTACTG
MetThrValCysAlaLysLysHisValHisLeuThrArgAspAlaAlaGluGlnLeuLeu

290 300 310 320 330 340

GCTGATATTGATCGACGCCTTGATCAGTTATTGCCCGTGGAGGGAGAACGGGATGTTGTG
AlaAspIleAspArgArgLeuAspGlnLeuLeuProValGluGlyGluArgAspValVal

350 360 370 380 390 400

GGTGCCGCGATGCGTGAAGGTGCGCTGGCACCGGGAAAACGTATTCGCCCCATGTTGCTG
GlyAlaAlaMetArgGluGlyAlaLeuAlaProGlyLysArgIleArgProMetLeuLeu

410 420 430 440 450 460

TTGCTGACCGCCCGCGATCTGGGTTGCGCTGTCAGCCATGACGGATTACTGGATTTGGCC
LeuLeuThrAlaArgAspLeuGlyCysAlaValSerHisAspGlyLeuLeuAspLeuAla

470 480 490 500 510 520

TGTGCGGTGGAAATGGTCCACGCGGCTTCGCTGATCCTTGACGATATGCCCTGCATGGAC
CysAlaValGluMetValHisAlaAlaSerLeuIleLeuAspAspMetProCysMetAsp

530 540 550 560 570 580

GATGCGAAGCTGCGGCGCGGACGCCCTACCATTCATTCTCATTACGGAGAGCATGTGGCA
AspAlaLysLeuArgArgGlyArgProThrIleHisSerHisTyrGlyGluHisValAla

590 600 610 620 630 640

ATACTGGCGGCGGTTGCCTTGCTGAGTAAAGCCTTTGGCGTAATTGCCGATGCAGATGGC
IleLeuAlaAlaValAlaLeuLeuSerLysAlaPheGlyValIleAlaAspAlaAspGly

650 660 670 680 690 700

CTCACGCCGCTGGCAAAAAATCGGGCGGTTTCTGAACTGTCAAACGCCATCGGCATGCAA
LeuThrProLeuAlaLysAsnArgAlaValSerGluLeuSerAsnAlaIleGlyMetGln

710 720 730 740 750 760

GGATTGGTTCAGGGTCAGTTCAAGGATCTGTCTGAAGGGGATAAGCCGCGCAGCGCTGAA
GlyLeuValGlnGlyGlnPheLysAspLeuSerGluGlyAspLysProArgSerAlaGlu

770 780 790 800 810 820

GCTATTTTGATGACGAATCACTTTAAAACCAGCACGCTGTTTTGTGCCTCCATGCAGATG
AlaIleLeuMetThrAsnHisPheLysThrSerThrLeuPheCysAlaSerMetGlnMet

830 840 850 860 870 880

GCCTCGATTGTTGCGAATGCCTCCAGCGAAGCGCGTGATTGCCTGCATCGTTTTTCACTT
AlaSerIleValAlaAsnAlaSerSerGluAlaArgAspCysLeuHisArgPheSerLeu

890 900 910 920 930 940

GATCTTGGTCAGGCATTTCAACTGCTGGACGATTTGACCGATGGCATGACCGACACCGGT
AspLeuGlyGlnAlaPheGlnLeuLeuAspAspLeuThrAspGlyMetThrAspThrGly

950 960 970 980 990 1000

AAGGATAGCAATCAGGACGCCGGTAAATCGACGCTGGTCAATCTGTTAGGCCCGAGGGCG
LysAspSerAsnGlnAspAlaGlyLysSerThrLeuValAsnLeuLeuGlyProArgAla

1010 1020 1030 1040 1050 1060

GTTGAAGAACGTCTGAGACAACATCTTCAGCTTGCCAGTGAGCATCTCTCTGCGGCCTGC
ValGluGluArgLeuArgGlnHisLeuGlnLeuAlaSerGluHisLeuSerAlaAlaCys

1070 1080 1090 1100 1110 1120

CAACACGGGCACGCCACTCAACATTTTATTCAGGCCTGGTTTGACAAAAAACTCGCTGCC
GlnHisGlyHisAlaThrGlnHisPheIleGlnAlaTrpPheAspLysLysLeuAlaAla

1130

GTCAGTTAA
ValSer***



B 
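
The figures pair each nucleotide line with the deduced amino acid sequence written in three-letter code, with `***` marking the stop codon. As a rough check of how such a line can be derived, the sketch below translates a DNA string using the standard genetic code. The function name `translate3` is a hypothetical helper, and the use of the standard code table is an assumption made for illustration; the specification does not describe the software used.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate3(dna):
    """Translate DNA in frame 1 into concatenated three-letter codes."""
    dna = dna.upper()
    residues = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for pos in range(0, len(dna) - 2, 3):
        residues.append(THREE[CODON_TABLE[dna[pos:pos + 3]]])
    return "".join(residues)


print(translate3("GTCAGTTAA"))  # ValSer***
```

The example input is the final line of the sequence above (positions 1130 onward), and the result agrees with the amino acid line printed beneath it.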
1150 1160 1170 1180 1190 1200

ATGAGCCATTTCGCGGCGATCGCACCGCCTTTTTACAGCCATGTTCGCGCATTACAGAAT
MetSerHisPheAlaAlaIleAlaProProPheTyrSerHisValArgAlaLeuGlnAsn

1210 1220 1230 1240 1250 1260

CTCGCTCAGGAACTGGTCGCGCGCGGTCATCGGGTGACCTTTATTCAGCAATACGATATT
LeuAlaGlnGluLeuValAlaArgGlyHisArgValThrPheIleGlnGlnTyrAspIle

1270 1280 1290 1300 1310 1320

AAACACTTGATCGATAGCGAAACCATTGGATTTCATTCCGTCGGGACAGACAGCCATCCC
LysHisLeuIleAspSerGluThrIleGlyPheHisSerValGlyThrAspSerHisPro

1330 1340 1350 1360 1370 1380

CCCGGCGCGTTAACGCGCGTGCTACACCTGGCGGCTCATCCTCTGGGGCCGTCAATGCTG
ProGlyAlaLeuThrArgValLeuHisLeuAlaAlaHisProLeuGlyProSerMetLeu

1390 1400 1410 1420 1430 1440

AAGCTCATCAATGAAATGGCGCGCACCACCGATATGCTGTGCCGCGAACTCCCCCAGGCA
LysLeuIleAsnGluMetAlaArgThrThrAspMetLeuCysArgGluLeuProGlnAla

1450 1460 1470 1480 1490 1500

TTTAACGATCTGGCCGTCGATGGCGTCATTGTTGATCAAATGGAACCGGCAGGCGCGCTC
PheAsnAspLeuAlaValAspGlyValIleValAspGlnMetGluProAlaGlyAlaLeu

1510 1520 1530 1540 1550 1560

GTTGCTGAAGCACTGGGACTGCCGTTTATCTCTGTCGCCTGCGCGCTGCCTCTCAATCGT
ValAlaGluAlaLeuGlyLeuProPheIleSerValAlaCysAlaLeuProLeuAsnArg

1570 1580 1590 1600 1610 1620

GAACCGGATATGCCCCTGGCGGTTATGCCTTTCGAATACGGGACCAGCGACGCGGCTCGC
GluProAspMetProLeuAlaValMetProPheGluTyrGlyThrSerAspAlaAlaArg

1630 1640 1650 1660 1670 1680

GAACGTTATGCCGCCAGTGAAAAAATTTATGACTGGCTAATGCGTCGTCATGACCGTGTC
GluArgTyrAlaAlaSerGluLysIleTyrAspTrpLeuMetArgArgHisAspArgVal

1690 1700 1710 1720 1730 1740

ATTGCCGAACACAGCCACAGAATGGGCTTAGCCCCCCGGCAAAAGCTTCACCAGTGTTTT
IleAlaGluHisSerHisArgMetGlyLeuAlaProArgGlnLysLeuHisGlnCysPhe

1750 1760 1770 1780 1790 1800

TCGCCACTGGCGCAAATCAGCCAGCTTGTTCCTGAACTGGATTTTCCCCGCAAAGCGTTA
SerProLeuAlaGlnIleSerGlnLeuValProGluLeuAspPheProArgLysAlaLeu

1810 1820 1830 1840 1850 1860

CCGGCTTGTTTTCATGCCGTCGGGCCTCTGCGCGAAACGCACGCACCGTCAACGTCTTCA
ProAlaCysPheHisAlaValGlyProLeuArgGluThrHisAlaProSerThrSerSer

1870 1880 1890 1900 1910 1920

TCCCGTTATTTTACATCCTCAGAAAAACCCCGGATTTTCGCCTCGCTGGGCACGCTTCAG
SerArgTyrPheThrSerSerGluLysProArgIlePheAlaSerLeuGlyThrLeuGln

1930 1940 1950 1960 1970 1980

GGACACCGTTATGGGCTGTTTAAAACGATAGTGAAAGCCTGTGAAGAAATTGACGGTCAG
GlyHisArgTyrGlyLeuPheLysThrIleValLysAlaCysGluGluIleAspGlyGln

1990 2000 2010 2020 2030 2040

CTCCTGTTAGCCCACTGTGGTCGTCTTACGGACTCTCAGTGTGAAGAGCTGGCGCGAAGC
LeuLeuLeuAlaHisCysGlyArgLeuThrAspSerGlnCysGluGluLeuAlaArgSer

2050 2060 2070 2080 2090 2100

CGTCATACACAGGTGGTGGATTTTGCCGATCAGTCAGCCGCGCTGTCTCAGGCGCAGCTG
ArgHisThrGlnValValAspPheAlaAspGlnSerAlaAlaLeuSerGlnAlaGlnLeu

2110 2120 2130 2140 2150 2160

GCGATCACCCACGGCGGCATGAATACGGTACTGGACGCGATTAATTACCGGACGCCCCTT
AlaIleThrHisGlyGlyMetAsnThrValLeuAspAlaIleAsnTyrArgThrProLeu

2170 2180 2190 2200 2210 2220

TTAGCGCTTCCGCTGGCCTTTGATCAGCCCGGCGTCGCGTCACGCATCGTTTATCACGGC
LeuAlaLeuProLeuAlaPheAspGlnProGlyValAlaSerArgIleValTyrHisGly

2230 2240 2250 2260 2270 2280

ATCGGCAAGCGTGCTTCCCGCTTTACCACCAGCCATGCTTTGGCTCGTCAGATGCGTTCA
IleGlyLysArgAlaSerArgPheThrThrSerHisAlaLeuAlaArgGlnMetArgSer

2290 2300 2310 2320 2330 2340

TTGCTGACCAACGTCGACTTTCAGCAGCGCATGGCGAAAATCCAGACAGCCCTTCGTTTG
LeuLeuThrAsnValAspPheGlnGlnArgMetAlaLysIleGlnThrAlaLeuArgLeu

2350 2360 2370 2380 2390 2400

GCAGGGGGCACCATGGCCGCTGCCGATATCATTGAGCAGGTTATGTGCACCGGTCAGCCT
AlaGlyGlyThrMetAlaAlaAlaAspIleIleGluGlnValMetCysThrGlyGlnPro

2410 2420 2430

GTCTTAAGTGGGAGCGGCTATGCAACCGCATTATGA
ValLeuSerGlySerGlyTyrAlaThrAlaLeu***

D
2430 2440 2450 2460 2470 2480

ATGCAACCGCATTATGATCTGATTCTCGTGGGGGCTGGACTCGCGAATGGCCTTATCGCC
MetGlnProHisTyrAspLeuIleLeuValGlyAlaGlyLeuAlaAsnGlyLeuIleAla

2490 2500 2510 2520 2530 2540

CTGCGTCTTCAGCAGCAGCAACCTGATATGCGTATTTTGCTTATCGACGCCGCACCCCAG
LeuArgLeuGlnGlnGlnGlnProAspMetArgIleLeuLeuIleAspAlaAlaProGln

2550 2560 2570 2580 2590 2600

GCGGGCGGGAATCATACGTGGTCATTTCACCACGATGATTTGACTGAGAGCCAACATCGT
AlaGlyGlyAsnHisThrTrpSerPheHisHisAspAspLeuThrGluSerGlnHisArg

2610 2620 2630 2640 2650 2660

TGGATAGCTCCGCTGGTGGTTCATCACTGGCCCGACTATCAGGTACGCTTTCCCACACGC
TrpIleAlaProLeuValValHisHisTrpProAspTyrGlnValArgPheProThrArg

2670 2680 2690 2700 2710 2720

CGTCGTAAGCTGAACAGCGGCTACTTTTGTATTACTTCTCAGCGTTTCGCTGAGGTTTTA
ArgArgLysLeuAsnSerGlyTyrPheCysIleThrSerGlnArgPheAlaGluValLeu

2730 2740 2750 2760 2770 2780

CAGCGACAGTTTGGCCCGCACTTGTGGATGGATACCGCGGTCGCAGAGGTTAATGCGGAA
GlnArgGlnPheGlyProHisLeuTrpMetAspThrAlaValAlaGluValAsnAlaGlu

2790 2800 2810 2820 2830 2840

TCTGTTCGGTTGAAAAAGGGTCAGGTTATCGGTGCCCGCGCGGTGATTGACGGGCGGGGT
SerValArgLeuLysLysGlyGlnValIleGlyAlaArgAlaValIleAspGlyArgGly

2850 2860 2870 2880 2890 2900

TATGCGGCAAATTCAGCACTGAGCGTGGGCTTCCAGGCGTTTATTGGCCAGGAATGGCGA
TyrAlaAlaAsnSerAlaLeuSerValGlyPheGlnAlaPheIleGlyGlnGluTrpArg

2910 2920 2930 2940 2950 2960

TTGAGCCACCCGCATGGTTTATCGTCTCCCATTATCATGGATGCCACGGTCGATCAGCAA
LeuSerHisProHisGlyLeuSerSerProIleIleMetAspAlaThrValAspGlnGln

2970 2980 2990 3000 3010 3020

AATGGTTATCGCTTCGTGTACAGCCTGCCGCTCTCGCCGACCAGATTGTTAATTGAAGAC
AsnGlyTyrArgPheValTyrSerLeuProLeuSerProThrArgLeuLeuIleGluAsp

3030 3040 3050 3060 3070 3080

ACGCACTATATTGATAATGCGACATTAGATCCTGAATGCGCGCGGCAAAATATTTGCGAC
ThrHisTyrIleAspAsnAlaThrLeuAspProGluCysAlaArgGlnAsnIleCysAsp



FIG. 3 (a) 
3090 3100 3110 3120 3130 3140

TATGCCGCGCAACAGGGTTGGCAGCTTCAGACACTGCTGCGAGAAGAACAGGGCGCCTTA
TyrAlaAlaGlnGlnGlyTrpGlnLeuGlnThrLeuLeuArgGluGluGlnGlyAlaLeu

3150 3160 3170 3180 3190 3200

CCCATTACTCTGTCGGGCAATGCCGACGCATTCTGGCAGCAGCGCCCCCTGGCCTGTAGT
ProIleThrLeuSerGlyAsnAlaAspAlaPheTrpGlnGlnArgProLeuAlaCysSer

3210 3220 3230 3240 3250 3260

GGATTACGTGCCGGTCTGTTCCATCCTACCACCGGCTATTCACTGCCGCTGGCGGTTGCC
GlyLeuArgAlaGlyLeuPheHisProThrThrGlyTyrSerLeuProLeuAlaValAla

3270 3280 3290 3300 3310 3320

GTGGCCGACCGCCTGAGTGCACTTGATGTCTTTACGTCGGCCTCAATTCACCATGCCATT
ValAlaAspArgLeuSerAlaLeuAspValPheThrSerAlaSerIleHisHisAlaIle

3330 3340 3350 3360 3370 3380

ACGCATTTTGCCCGCGAGCGCTGGCAGCAGCAGGGCTTTTTCCGCATGCTGAATCGCATG
ThrHisPheAlaArgGluArgTrpGlnGlnGlnGlyPhePheArgMetLeuAsnArgMet

3390 3400 3410 3420 3430 3440

CTGTTTTTAGCCGGACCCGCCGATTCACGCTGGCGGGTTATGCAGCGTTTTTATGGTTTA
LeuPheLeuAlaGlyProAlaAspSerArgTrpArgValMetGlnArgPheTyrGlyLeu

3450 3460 3470 3480 3490 3500

CCTGAAGATTTAATTGCCCGTTTTTATGCGGGAAAACTCACGCTGACCGATCGGCTACGT
ProGluAspLeuIleAlaArgPheTyrAlaGlyLysLeuThrLeuThrAspArgLeuArg

3510 3520 3530 3540 3550 3560

ATTCTGAGCGGCAAGCCGCCTGTTCCGGTATTAGCAGCATTGCAAGCCATTATGACGACT
IleLeuSerGlyLysProProValProValLeuAlaAlaLeuGlnAlaIleMetThrThr



FIG. 3 (b) 
F



EP 0 393 690 A1 



G

3590 3600 3610 3620 3630 3640

ATGAAACCAACTACGGTAATTGGTGCAGGCTTCGGTGGCCTGGCACTGGCAATTCGTCTA
MetLysProThrThrValIleGlyAlaGlyPheGlyGlyLeuAlaLeuAlaIleArgLeu

3650 3660 3670 3680 3690 3700

CAAGCTGCGGGGATCCCCGTCTTACTGCTTGAACAACGTGATAAACCCGGCGGTCGGGCT
GlnAlaAlaGlyIleProValLeuLeuLeuGluGlnArgAspLysProGlyGlyArgAla

3710 3720 3730 3740 3750 3760

TATGTCTACGAGGATCAGGGGTTTACCTTTGATGCAGGCCCGACGGTTATCACCGATCCC
TyrValTyrGluAspGlnGlyPheThrPheAspAlaGlyProThrValIleThrAspPro

3770 3780 3790 3800 3810 3820

AGTGCCATTGAAGAACTGTTTGCACTGGCAGGAAAACAGTTAAAAGAGTATGTCGAACTG
SerAlaIleGluGluLeuPheAlaLeuAlaGlyLysGlnLeuLysGluTyrValGluLeu

3830 3840 3850 3860 3870 3880

CTGCCGGTTACGCCGTTTTACCGCCTGTGTTGGGAGTCAGGGAAGGTCTTTAATTACGAT
LeuProValThrProPheTyrArgLeuCysTrpGluSerGlyLysValPheAsnTyrAsp

3890 3900 3910 3920 3930 3940

AACGATCAAACCCGGCTCGAAGCGCAGATTCAGCAGTTTAATCCCCGCGATGTCGAAGGT
AsnAspGlnThrArgLeuGluAlaGlnIleGlnGlnPheAsnProArgAspValGluGly

3950 3960 3970 3980 3990 4000

TATCGTCAGTTTCTGGACTATTCACGCGCGGTGTTTAAAGAAGGCTATCTAAAGCTCGGT
TyrArgGlnPheLeuAspTyrSerArgAlaValPheLysGluGlyTyrLeuLysLeuGly

4010 4020 4030 4040 4050 4060

ACTGTCCCTTTTTTATCGTTCAGAGACATGCTTCGCGCCGCACCTCAACTGGCGAAACTG
ThrValProPheLeuSerPheArgAspMetLeuArgAlaAlaProGlnLeuAlaLysLeu

4070 4080 4090 4100 4110 4120

CAGGCATGGAGAAGCGTTTACAGTAAGGTTGCCAGTTACATCGAAGATGAACATCTGCGC
GlnAlaTrpArgSerValTyrSerLysValAlaSerTyrIleGluAspGluHisLeuArg

4130 4140 4150 4160 4170 4180

CAGGCGTTTTCTTTCCACTCGCTGTTGGTGGGCGGCAATCCCTTCGCCACCTCATCCATT
GlnAlaPheSerPheHisSerLeuLeuValGlyGlyAsnProPheAlaThrSerSerIle

4190 4200 4210 4220 4230 4240

TATACGTTGATACACGCGCTGGAGCGTGAGTGGGGCGTCTGGTTTCCGCGTGGCGGCACC
TyrThrLeuIleHisAlaLeuGluArgGluTrpGlyValTrpPheProArgGlyGlyThr

4250 4260 4270 4280 4290 4300

GGCGCATTAGTTCAGGGGATGATAAAGCTGTTTCAGGATCTGGGTGGCGAAGTCGTGTTA
GlyAlaLeuValGlnGlyMetIleLysLeuPheGlnAspLeuGlyGlyGluValValLeu

4310 4320 4330 4340 4350 4360

AACGCCAGAGTCAGCCATATGGAAACGACAGGAAACAAGATTGAAGCCGTGCATTTAGAG
AsnAlaArgValSerHisMetGluThrThrGlyAsnLysIleGluAlaValHisLeuGlu

4370 4380 4390 4400 4410 4420

GACGGTCGCAGGTTCCTGACGCAAGCCGTCGCGTCAAATGCAGATGTGGTTCATACCTAT
AspGlyArgArgPheLeuThrGlnAlaValAlaSerAsnAlaAspValValHisThrTyr

4430 4440 4450 4460 4470 4480

CGCGACCTGTTAAGCCAGCACCCTGCCGCGGTTAAGCAGTCCAACAAACTGCAGACTAAG
ArgAspLeuLeuSerGlnHisProAlaAlaValLysGlnSerAsnLysLeuGlnThrLys

4490 4500 4510 4520 4530 4540

CGCATGAGTAACTCTCTGTTTGTGCTCTATTTTGGTTTGAATCACCATCATGATCAGCTC
ArgMetSerAsnSerLeuPheValLeuTyrPheGlyLeuAsnHisHisHisAspGlnLeu

4550 4560 4570 4580 4590 4600

GCGCATCACACGGTTTGTTTCGGCCCGCGTTACCGCGAGCTGATTGACGAAATTTTTAAT
AlaHisHisThrValCysPheGlyProArgTyrArgGluLeuIleAspGluIlePheAsn

4610 4620 4630 4640 4650 4660

CATGATGGCCTCGCAGAGGACTTCTCACTTTATCTGCACGCGCCCTGTGTCACGGATTCG
HisAspGlyLeuAlaGluAspPheSerLeuTyrLeuHisAlaProCysValThrAspSer

4670 4680 4690 4700 4710 4720

TCACTGGCGCCTGAAGGTTGCGGCAGTTACTATGTGTTGGCGCCGGTGCCGCATTTAGGC
SerLeuAlaProGluGlyCysGlySerTyrTyrValLeuAlaProValProHisLeuGly

4730 4740 4750 4760 4770 4780

ACCGCGAACCTCGACTGGACGGTTGAGGGGCCAAAACTACGCGACCGTATTTTTGCGTAC
ThrAlaAsnLeuAspTrpThrValGluGlyProLysLeuArgAspArgIlePheAlaTyr

4790 4800 4810 4820 4830 4840

CTTGAGCAGCATTACATGCCTGGCTTACGGAGTCAGCTGGTCACGCACCGGATGTTTACG
LeuGluGlnHisTyrMetProGlyLeuArgSerGlnLeuValThrHisArgMetPheThr

4850 4860 4870 4880 4890 4900

CCGTTTGATTTTCGCGACCAGCTTAATGCCTATCATGGCTCAGCCTTTTCTGTGGAGCCC
ProPheAspPheArgAspGlnLeuAsnAlaTyrHisGlySerAlaPheSerValGluPro

4910 4920 4930 4940 4950 4960

GTTCTTACCCAGAGCGCCTGGTTTCGGCCGCATAACCGCGATAAAACCATTACTAATCTC
ValLeuThrGlnSerAlaTrpPheArgProHisAsnArgAspLysThrIleThrAsnLeu

4970 4980 4990 5000 5010 5020

TACCTGGTCGGCGCAGGCACGCATCCCGGCGCAGGCATTCCTGGCGTCATCGGCTCGGCA
TyrLeuValGlyAlaGlyThrHisProGlyAlaGlyIleProGlyValIleGlySerAla

5030 5040 5050 5060

AAAGCGACAGCAGGTTTGATGCTGGAGGATCTGATTTGA
LysAlaThrAlaGlyLeuMetLeuGluAspLeuIle***

H 
5100 5110 5120 5130 5140 5150

ATGGCAGTTGGCTCGAAAAGTTTTGCGACAGCCTCAAAGTTATTTGATGCAAAAACCCGG
MetAlaValGlySerLysSerPheAlaThrAlaSerLysLeuPheAspAlaLysThrArg

5160 5170 5180 5190 5200 5210

CGCAGCGTACTGATGCTCTACGCCTGGTGCCGCCATTGTGACGATGTTATTGACGATCAG
ArgSerValLeuMetLeuTyrAlaTrpCysArgHisCysAspAspValIleAspAspGln

5220 5230 5240 5250 5260 5270

ACGCTGGGCTTTCAGGCCCGGCAGCCTGCCTTACAAACGCCCGAACAACGTCTGATGCAA
ThrLeuGlyPheGlnAlaArgGlnProAlaLeuGlnThrProGluGlnArgLeuMetGln

5280 5290 5300 5310 5320 5330

CTTGAGATGAAAACGCGCCAGGCCTATGCAGGATCGCAGATGCACGAACCGGCGTTTGCG
LeuGluMetLysThrArgGlnAlaTyrAlaGlySerGlnMetHisGluProAlaPheAla

5340 5350 5360 5370 5380 5390

GCTTTTCAGGAAGTGGCTATGGCTCATGATATCGCCCCGGCTTACGCGTTTGATCATCTG
AlaPheGlnGluValAlaMetAlaHisAspIleAlaProAlaTyrAlaPheAspHisLeu

5400 5410 5420 5430 5440 5450

GAAGGCTTCGCCATGGATGTACGCGAAGCGCAATACAGCCAACTGGATGATACGCTGCGC
GluGlyPheAlaMetAspValArgGluAlaGlnTyrSerGlnLeuAspAspThrLeuArg

5460 5470 5480 5490 5500 5510

TATTGCTATCACGTTGCAGGCGTTGTCGGCTTGATGATGGCGCAAATCATGGGCGTGCGG
TyrCysTyrHisValAlaGlyValValGlyLeuMetMetAlaGlnIleMetGlyValArg

5520 5530 5540 5550 5560 5570

GATAACGCCACGCTGGACCGCGCCTGTGACCTTGGGCTGGCATTTCAGTTGACCAATATT
AspAsnAlaThrLeuAspArgAlaCysAspLeuGlyLeuAlaPheGlnLeuThrAsnIle

5580 5590 5600 5610 5620 5630

GCTCGCGATATTGTGGACGATGCGCATGCGGGCCGCTGTTATCTGCCGGCAAGCTGGCTG
AlaArgAspIleValAspAspAlaHisAlaGlyArgCysTyrLeuProAlaSerTrpLeu

5640 5650 5660 5670 5680 5690

GAGCATGAAGGTCTGAACAAAGAGAATTATGCGGCACCTGAAAACCGTCAGGCGCTGAGC
GluHisGluGlyLeuAsnLysGluAsnTyrAlaAlaProGluAsnArgGlnAlaLeuSer

5700 5710 5720 5730 5740 5750

CGTATCGCCCGTCGTTTGGTGCAGGAAGCAGAACCTTACTATTTGTCTGCCACAGCCGGC
ArgIleAlaArgArgLeuValGlnGluAlaGluProTyrTyrLeuSerAlaThrAlaGly

5760 5770 5780 5790 5800 5810

CTGGCAGGGTTGCCCCTGCGTTCCGCCTGGGCAATCGCTACGGCGAAGCAGGTTTACCGG
LeuAlaGlyLeuProLeuArgSerAlaTrpAlaIleAlaThrAlaLysGlnValTyrArg

5820 5830 5840 5850 5860 5870

AAAATAGGTGTCAAAGTTGAACAGGCCGGTCAGCAAGCCTGGGATCAGCGGCAGTCAACG
LysIleGlyValLysValGluGlnAlaGlyGlnGlnAlaTrpAspGlnArgGlnSerThr

5880 5890 5900 5910 5920 5930

ACCACGCCCGAAAAATTAACGCTGCTGCTGGCCGCCTCTGGTCAGGCCCTTACTTCCCGG
ThrThrProGluLysLeuThrLeuLeuLeuAlaAlaSerGlyGlnAlaLeuThrSerArg

5940 5950 5960 5970 5980

ATGCGGGCTCATCCTCCCCGCCCTGCGCATCTCTGGCAGCGCCCGCTCTAG
MetArgAlaHisProProArgProAlaHisLeuTrpGlnArgProLeu***



J 
6452



ATGTTGTGGATTTGGAATGCCCTGATCGTTTTCGTTACCGTGATTGGCATGGAAGTGATT
MetLeuTrpIleTrpAsnAlaLeuIleValPheValThrValIleGlyMetGluValIle

GCTGCACTGGCACACAAATACATCATGCACGGCTGGGGTTGGGGATGGCATCTTTCACAT
AlaAlaLeuAlaHisLysTyrIleMetHisGlyTrpGlyTrpGlyTrpHisLeuSerHis

CATGAACCGCGTAAAGGTGCGTTTGAAGTTAACGATCTTTATGCCGTGGTTTTTGCTGCA
HisGluProArgLysGlyAlaPheGluValAsnAspLeuTyrAlaValValPheAlaAla

TTATCGATCCTGCTGATTTATCTGGGCAGTACAGGAATGTGGCCGCTCCAGTGGATTGGC
LeuSerIleLeuLeuIleTyrLeuGlySerThrGlyMetTrpProLeuGlnTrpIleGly

GCAGGTATGACGGCGTATGGATTACTCTATTTTATGGTGCACGACGGGCTGGTGCATCAA
AlaGlyMetThrAlaTyrGlyLeuLeuTyrPheMetValHisAspGlyLeuValHisGln

CGTTGGCCATTCCGCTATATTCCACGCAAGGGCTACCTCAAACGGTTGTATATGGCGCAC
ArgTrpProPheArgTyrIleProArgLysGlyTyrLeuLysArgLeuTyrMetAlaHis

CGTATGCATCACGCCGTCAGGGGCAAAGAAGGTTGTGTTTCTTTTGGCTTCCTCTATGCG
ArgMetHisHisAlaValArgGlyLysGluGlyCysValSerPheGlyPheLeuTyrAla

CCGCCCCTGTCAAAACTTCAGGCGACGCTCCGGGAAAGACATGGCGCTAGAGCGGGCGCT
ProProLeuSerLysLeuGlnAlaThrLeuArgGluArgHisGlyAlaArgAlaGlyAla

5925

GCCAGAGATGCGCAGGGCGGGGAGGATGAGCCCGCATCCGGGAAGTAA
AlaArgAspAlaGlnGlyGlyGluAspGluProAlaSerGlyLys***



L 
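
The sequence from K to L is numbered downward, from 6452 to 5925, which suggests that this open reading frame is read from the strand complementary to the one printed in Fig. 7. A short Python sketch of the reverse-complement operation is given below as a way to check this; the helper name `reverse_complement` is an illustrative assumption, and the test string is the stretch printed at positions 6421-6452 of Fig. 7.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary strand, read 5' to 3'."""
    # Reverse the string, then swap each base for its pairing partner.
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


# Positions 6421-6452 as printed in Fig. 7 (top strand).
top_strand = "AAAACGATCAGGGCATTCCAAATCCACAACAT"
print(reverse_complement(top_strand))
# ATGTTGTGGATTTGGAATGCCCTGATCGTTTT
```

The printed result matches the first 32 bases of the sequence from K to L, beginning with the ATG start codon.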
1 10 20 30 40 50

GGTACCGCAC GGTCTGCCAA TCCGACGGAG GTTTATGAAT TTTCCACCTT TTCCACAAGC

70 80 90 100 110

TCAACTAGTA TTAACGATGT GGATTTAGCA AAAAAAACCT GTAACCCTAA ATGTAAAATA

130 140 150 160 170

ACGGGTAAGC CTGCCAACCA TGTTATGGCA GATTAAGCGT CTTTTTGAAG GGCACCGCAT

190 200 210 220 230

CTTTCGCGTT GCCGTAAATG TATCCGTTTA TAAGGACAGC CCGAATGACG GTCTGCGCAA

250 260 270 280 290

AAAAACACGT TCATCTCACT CGCGATGCTG CGGAGCAGTT ACTGGCTGAT ATTGATCGAC

310 320 330 340 350

GCCTTGATCA GTTATTGCCC GTGGAGGGAG AACGGGATGT TGTGGGTGCC GCGATGCGTG

370 380 390 400 410

AAGGTGCGCT GGCACCGGGA AAACGTATTC GCCCCATGTT GCTGTTGCTG ACCGCCCGCG

430 440 450 460 470

ATCTGGGTTG CGCTGTCAGC CATGACGGAT TACTGGATTT GGCCTGTGCG GTGGAAATGG

490 500 510 520 530

TCCACGCGGC TTCGCTGATC CTTGACGATA TGCCCTGCAT GGACGATGCG AAGCTGCGGC

550 560 570 580 590

GCGGACGCCC TACCATTCAT TCTCATTACG GAGAGCATGT GGCAATACTG GCGGCGGTTG

610 620 630 640 650

CCTTGCTGAG TAAAGCCTTT GGCGTAATTG CCGATGCAGA TGGCCTCACG CCGCTGGCAA

670 680 690 700 710

AAAATCGGGC GGTTTCTGAA CTGTCAAACG CCATCGGCAT GCAAGGATTG GTTCAGGGTC

730 740 750 760 770

AGTTCAAGGA TCTGTCTGAA GGGGATAAGC CGCGCAGCGC TGAAGCTATT TTGATGACGA

790 800 810 820 830

ATCACTTTAA AACCAGCACG CTGTTTTGTG CCTCCATGCA GATGGCCTCG ATTGTTGCGA

850 860 870 880 890

ATGCCTCCAG CGAAGCGCGT GATTGCCTGC ATCGTTTTTC ACTTGATCTT GGTCAGGCAT

910 920 930 940 950

TTCAACTGCT GGACGATTTG ACCGATGGCA TGACCGACAC CGGTAAGGAT AGCAATCAGG

970 980 990 1000 1010

ACGCCGGTAA ATCGACGCTG GTCAATCTGT TAGGCCCGAG GGCGGTTGAA GAACGTCTGA

1030 1040 1050 1060 1070

GACAACATCT TCAGCTTGCC AGTGAGCATC TCTCTGCGGC CTGCCAACAC GGGCACGCCA

1090 1100 1110 1120 1130

CTCAACATTT TATTCAGGCC TGGTTTGACA AAAAACTCGC TGCCGTCAGT TAAGGATGCT
1150 1160 1170 1180 1190

GCATGAGCCA TTTCGCGGCG ATCGCACCGC CTTTTTACAG CCATGTTCGC GCATTACAGA

1210 1220 1230 1240 1250

ATCTCGCTCA GGAACTGGTC GCGCGCGGTC ATCGGGTGAC CTTTATTCAG CAATACGATA

1270 1280 1290 1300 1310

TTAAACACTT GATCGATAGC GAAACCATTG GATTTCATTC CGTCGGGACA GACAGCCATC

1330 1340 1350 1360 1370

CCCCCGGCGC GTTAACGCGC GTGCTACACC TGGCGGCTCA TCCTCTGGGG CCGTCAATGC

1390 1400 1410 1420 1430

TGAAGCTCAT CAATGAAATG GCGCGCACCA CCGATATGCT GTGCCGCGAA CTCCCCCAGG

1450 1460 1470 1480 1490

CATTTAACGA TCTGGCCGTC GATGGCGTCA TTGTTGATCA AATGGAACCG GCAGGCGCGC

1510 1520 1530 1540 1550

TCGTTGCTGA AGCACTGGGA CTGCCGTTTA TCTCTGTCGC CTGCGCGCTG CCTCTCAATC

1570 1580 1590 1600 1610

GTGAACCGGA TATGCCCCTG GCGGTTATGC CTTTCGAATA CGGGACCAGC GACGCGGCTC

1630 1640 1650 1660 1670

GCGAACGTTA TGCCGCCAGT GAAAAAATTT ATGACTGGCT AATGCGTCGT CATGACCGTG

1690 1700 1710 1720 1730

TCATTGCCGA ACACAGCCAC AGAATGGGCT TAGCCCCCCG GCAAAAGCTT CACCAGTGTT

1750 1760 1770 1780 1790

TTTCGCCACT GGCGCAAATC AGCCAGCTTG TTCCTGAACT GGATTTTCCC CGCAAAGCGT

1810 1820 1830 1840 1850

TACCGGCTTG TTTTCATGCC GTCGGGCCTC TGCGCGAAAC GCACGCACCG TCAACGTCTT

1870 1880 1890 1900 1910

CATCCCGTTA TTTTACATCC TCAGAAAAAC CCCGGATTTT CGCCTCGCTG GGCACGCTTC

1930 1940 1950 1960 1970

AGGGACACCG TTATGGGCTG TTTAAAACGA TAGTGAAAGC CTGTGAAGAA ATTGACGGTC

1990 2000 2010 2020 2030

AGCTCCTGTT AGCCCACTGT GGTCGTCTTA CGGACTCTCA GTGTGAAGAG CTGGCGCGAA

2050 2060 2070 2080 2090

GCCGTCATAC ACAGGTGGTG GATTTTGCCG ATCAGTCAGC CGCGCTGTCT CAGGCGCAGC

2110 2120 2130 2140 2150

TGGCGATCAC CCACGGCGGC ATGAATACGG TACTGGACGC GATTAATTAC CGGACGCCCC

2170 2180 2190 2200 2210

TTTTAGCGCT TCCGCTGGCC TTTGATCAGC CCGGCGTCGC GTCACGCATC GTTTATCACG

2230 2240 2250 2260 2270

GCATCGGCAA GCGTGCTTCC CGCTTTACCA CCAGCCATGC TTTGGCTCGT CAGATGCGTT
2290 2300 2310 2320 2330

CATTGCTGAC CAACGTCGAC TTTCAGCAGC GCATGGCGAA AATCCAGACA GCCCTTCGTT

2350 2360 2370 2380 2390

TGGCAGGGGG CACCATGGCC GCTGCCGATA TCATTGAGCA GGTTATGTGC ACCGGTCAGC

2410 2420 2430 2440 2450

CTGTCTTAAG TGGGAGCGGC TATGCAACCG CATTATGATC TGATTCTCGT GGGGGCTGGA

2470 2480 2490 2500 2510

CTCGCGAATG GCCTTATCGC CCTGCGTCTT CAGCAGCAGC AACCTGATAT GCGTATTTTG

2530 2540 2550 2560 2570

CTTATCGACG CCGCACCCCA GGCGGGCGGG AATCATACGT GGTCATTTCA CCACGATGAT

2590 2600 2610 2620 2630

TTGACTGAGA GCCAACATCG TTGGATAGCT CCGCTGGTGG TTCATCACTG GCCCGACTAT

2650 2660 2670 2680 2690

CAGGTACGCT TTCCCACACG CCGTCGTAAG CTGAACAGCG GCTACTTTTG TATTACTTCT

2710 2720 2730 2740 2750

CAGCGTTTCG CTGAGGTTTT ACAGCGACAG TTTGGCCCGC ACTTGTGGAT GGATACCGCG

2770 2780 2790 2800 2810

GTCGCAGAGG TTAATGCGGA ATCTGTTCGG TTGAAAAAGG GTCAGGTTAT CGGTGCCCGC

2830 2840 2850 2860 2870

GCGGTGATTG ACGGGCGGGG TTATGCGGCA AATTCAGCAC TGAGCGTGGG CTTCCAGGCG

2890 2900 2910 2920 2930

TTTATTGGCC AGGAATGGCG ATTGAGCCAC CCGCATGGTT TATCGTCTCC CATTATCATG

2950 2960 2970 2980 2990

GATGCCACGG TCGATCAGCA AAATGGTTAT CGCTTCGTGT ACAGCCTGCC GCTCTCGCCG

3010 3020 3030 3040 3050

ACCAGATTGT TAATTGAAGA CACGCACTAT ATTGATAATG CGACATTAGA TCCTGAATGC

3070 3080 3090 3100 3110

GCGCGGCAAA ATATTTGCGA CTATGCCGCG CAACAGGGTT GGCAGCTTCA GACACTGCTG

3130 3140 3150 3160 3170

CGAGAAGAAC AGGGCGCCTT ACCCATTACT CTGTCGGGCA ATGCCGACGC ATTCTGGCAG

3190 3200 3210 3220 3230

CAGCGCCCCC TGGCCTGTAG TGGATTACGT GCCGGTCTGT TCCATCCTAC CACCGGCTAT

3250 3260 3270 3280 3290

TCACTGCCGC TGGCGGTTGC CGTGGCCGAC CGCCTGAGTG CACTTGATGT CTTTACGTCG

3310 3320 3330 3340 3350

GCCTCAATTC ACCATGCCAT TACGCATTTT GCCCGCGAGC GCTGGCAGCA GCAGGGCTTT

3370 3380 3390 3400 3410

TTCCGCATGC TGAATCGCAT GCTGTTTTTA GCCGGACCCG CCGATTCACG CTGGCGGGTT
3430 3440 3450 3460 3470

ATGCAGCGTT TTTATGGTTT ACCTGAAGAT TTAATTGCCC GTTTTTATGC GGGAAAACTC 

3490 3500 3510 3520 3530

ACGCTGACCG ATCGGCTACG TATTCTGAGC GGCAAGCCGC CTGTTCCGGT ATTAGCAGCA

3550 3560 3570 3580 3590

TTGCAAGCCA TTATGACGAC TCATCGTTAA AGAGCGACTA CATGAAACCA ACTACGGTAA

3610 3620 3630 3640 3650 

TTGGTGCAGG CTTCGGTGGC CTGGCACTGG CAATTCGTCT ACAAGCTGCG GGGATCCCCG 

3670 3680 3690 3700 3710 

TCTTACTGCT TGAACAACGT GATAAACCCG GCGGTCGGGC TTATGTCTAC GAGGATCAGG 

3730 3740 3750 3760 3770 

GGTTTACCTT TGATGCAGGC CCGACGGTTA TCACCGATCC CAGTGCCATT GAAGAACTGT 

3790 3800 3810 3820 3830 

TTGCACTGGC AGGAAAACAG TTAAAAGAGT ATGTCGAACT GCTGCCGGTT ACGCCGTTTT

3850 3860 3870 3880 3890 

ACCGCCTGTG TTGGGAGTCA GGGAAGGTCT TTAATTACGA TAACGATCAA ACCCGGCTCG 

3910 3920 3930 3940 3950 

AAGCGCAGAT TCAGCAGTTT AATCCCCGCG ATGTCGAAGG TTATCGTCAG TTTCTGGACT 

3970 3980 3990 4000 4010 

ATTCACGCGC GGTGTTTAAA GAAGGCTATC TAAAGCTCGG TACTGTCCCT TTTTTATCGT 

4030 4040 4050 4060 4070 

TCAGAGACAT GCTTCGCGCC GCACCTCAAC TGGCGAAACT GCAGGCATGG AGAAGCGTTT 

4090 4100 4110 4120 4130 

ACAGTAAGGT TGCCAGTTAC ATCGAAGATG AACATCTGCG CCAGGCGTTT TCTTTCCACT 

4150 4160 4170 4180 4190 

CGCTGTTGGT GGGCGGCAAT CCCTTCGCCA CCTCATCCAT TTATACGTTG ATACACGCGC 

4210 4220 4230 4240 4250 

TGGAGCGTGA GTGGGGCGTC TGGTTTCCGC GTGGCGGCAC CGGCGCATTA GTTCAGGGGA 

4270 4280 4290 4300 4310 

TGATAAAGCT GTTTCAGGAT CTGGGTGGCG AAGTCGTGTT AAACGCCAGA GTCAGCCATA 

4330 4340 4350 4360 4370 

TGGAAACGAC AGGAAACAAG ATTGAAGCCG TGCATTTAGA GGACGGTCGC AGGTTCCTGA 

4390 4400 4410 4420 4430 

CGCAAGCCGT CGCGTCAAAT GCAGATGTGG TTCATACCTA TCGCGACCTG TTAAGCCAGC 

4450 4460 4470 4480 4490 

ACCCTGCCGC GGTTAAGCAG TCCAACAAAC TGCAGACTAA GCGCATGAGT AACTCTCTGT 

4510 4520 4530 4540 4550 

TTGTGCTCTA TTTTGGTTTG AATCACCATC ATGATCAGCT CGCGCATCAC ACGGTTTGTT 
4570 4580 4590 4600 4610

TCGGCCCGCG TTACCGCGAG CTGATTGACG AAATTTTTAA TCATGATGGC CTCGCAGAGG

4630 4640 4650 4660 4670

ACTTCTCACT TTATCTGCAC GCGCCCTGTG TCACGGATTC GTCACTGGCG CCTGAAGGTT

4690 4700 4710 4720 4730

GCGGCAGTTA CTATGTGTTG GCGCCGGTGC CGCATTTAGG CACCGCGAAC CTCGACTGGA

4750 4760 4770 4780 4790

CGGTTGAGGG GCCAAAACTA CGCGACCGTA TTTTTGCGTA CCTTGAGCAG CATTACATGC

4810 4820 4830 4840 4850

CTGGCTTACG GAGTCAGCTG GTCACGCACC GGATGTTTAC GCCGTTTGAT TTTCGCGACC

4870 4880 4890 4900 4910

AGCTTAATGC CTATCATGGC TCAGCCTTTT CTGTGGAGCC CGTTCTTACC CAGAGCGCCT

4930 4940 4950 4960 4970

GGTTTCGGCC GCATAACCGC GATAAAACCA TTACTAATCT CTACCTGGTC GGCGCAGGCA

4990 5000 5010 5020 5030

CGCATCCCGG CGCAGGCATT CCTGGCGTCA TCGGCTCGGC AAAAGCGACA GCAGGTTTGA

5050 5060 5070 5080 5090

TGCTGGAGGA TCTGATTTGA ATAATCCGTC GTTACTCAAT CATGCGGTCG AAACGATGGC

5110 5120 5130 5140 5150

AGTTGGCTCG AAAAGTTTTG CGACAGCCTC AAAGTTATTT GATGCAAAAA CCCGGCGCAG

5170 5180 5190 5200 5210

CGTACTGATG CTCTACGCCT GGTGCCGCCA TTGTGACGAT GTTATTGACG ATCAGACGCT

5230 5240 5250 5260 5270

GGGCTTTCAG GCCCGGCAGC CTGCCTTACA AACGCCCGAA CAACGTCTGA TGCAACTTGA

5290 5300 5310 5320 5330

GATGAAAACG CGCCAGGCCT ATGCAGGATC GCAGATGCAC GAACCGGCGT TTGCGGCTTT

5350 5360 5370 5380 5390

TCAGGAAGTG GCTATGGCTC ATGATATCGC CCCGGCTTAC GCGTTTGATC ATCTGGAAGG

5410 5420 5430 5440 5450

CTTCGCCATG GATGTACGCG AAGCGCAATA CAGCCAACTG GATGATACGC TGCGCTATTG

5470 5480 5490 5500 5510

CTATCACGTT GCAGGCGTTG TCGGCTTGAT GATGGCGCAA ATCATGGGCG TGCGGGATAA

5530 5540 5550 5560 5570

CGCCACGCTG GACCGCGCCT GTGACCTTGG GCTGGCATTT CAGTTGACCA ATATTGCTCG

5590 5600 5610 5620 5630

CGATATTGTG GACGATGCGC ATGCGGGCCG CTGTTATCTG CCGGCAAGCT GGCTGGAGCA

5650 5660 5670 5680 5690

TGAAGGTCTG AACAAAGAGA ATTATGCGGC ACCTGAAAAC CGTCAGGCGC TGAGCCGTAT
5710 5720 5730 5740 5750

CGCCCGTCGT TTGGTGCAGG AAGCAGAACC TTACTATTTG TCTGCCACAG CCGGCCTGGC

5770 5780 5790 5800 5810

AGGGTTGCCC CTGCGTTCCG CCTGGGCAAT CGCTACGGCG AAGCAGGTTT ACCGGAAAAT

5830 5840 5850 5860 5870

AGGTGTCAAA GTTGAACAGG CCGGTCAGCA AGCCTGGGAT CAGCGGCAGT CAACGACCAC

5890 5900 5910 5920 5930

GCCCGAAAAA TTAACGCTGC TGCTGGCCGC CTCTGGTCAG GCCCTTACTT CCCGGATGCG

5950 5960 5970 5980 5990

GGCTCATCCT CCCCGCCCTG CGCATCTCTG GCAGCGCCCG CTCTAGCGCC ATGTCTTTCC

6010 6020 6030 6040 6050

CGGAGCGTCG CCTGAAGTTT TGACAGGGGC GGCGCATAGA GGAAGCCAAA AGAAACACAA

6070 6080 6090 6100 6110

CCTTCTTTGC CCCTGACGGC GTGATGCATA CGGTGCGCCA TATACAACCG TTTGAGGTAG

6130 6140 6150 6160 6170

CCCTTGCGTG GAATATAGCG GAATGGCCAA CGTTGATGCA CCAGCCCGTC GTGCACCATA

6190 6200 6210 6220 6230

AAATAGAGTA ATCCATACGC CGTCATACCT GCGCCAATCC ACTGGAGCGG CCACATTCCT

6250 6260 6270 6280 6290

GTACTGCCCA GATAAATCAG CAGGATCGAT AATGCAGCAA AAACCACGGC ATAAAGATCG

6310 6320 6330 6340 6350

TTAACTTCAA ACGCACCTTT ACGCGGTTCA TGATGTGAAA GATGCCATCC CCAACCCCAG

6370 6380 6390 6400 6410

CCGTGCATGA TGTATTTGTG TGCCAGTGCA GCAATCACTT CCATGCCAAT CACGGTAACG

6430 6440 6450 6460 6470

AAAACGATCA GGGCATTCCA AATCCACAAC ATAATTTCTC CGGTAGAGAC GTCTGGCAGC

6490 6500 6510 6520 6530

AGGCTTAAGG ATTCAATTTT AACAGAGATT AGCCGATCTG GCGGCGGGAA GGGAAAAAGG

6550 6560 6570 6580 6590

CGCGCCAGAA AGGCGCGCCA GGGATCAGAA GTCGGCTTTC AGAACCACAC GGTAGTTGGC

6610 6620 6630 6640 6650

TTTACCTGCA CGAACATGGT CCAGTGCATC GTTGATTTTC GACATCGGGA AGTACTCCAC

6670 6680 6690 6700 6710

TGTCGGCGCA ATATCTGTAC GGCCAGCCAG CTTCAGCAGT GAACGCAGCT GCGCAGGTGA

6730 6740 6750 6760 6770

ACCGGTTGAA GAACCCGTCA CGGCGCGGTC GCCTAAAATC AGGCTGAAAG CCGGGCACGT

6790 6800 6810 6820 6830

CAAACGGCTT CAGTACGGCA CCCACGGTAT GGAACTTACC GCGAGGCGCC AGGGCCGCAA
6850 6860 6870 6880 6890

AGTAGGGTTG CCAGTCGAGA TCGACGGCGA CCGTGCTGAT AATCAGGTCA AACTGGCCCG

6910 6918 
CCAGGCTTTT TAAAGCTT